Abstract

This paper deals with statistical tests on the components of mixture
densities. We propose to test whether the densities of two independent
samples of independent random variables $Y_1, \dots, Y_n$ and $Z_1, \dots, Z_n$
result from the same mixture of $M$ components or not. We provide a test
procedure which is proved to be asymptotically optimal in the minimax
setting. We extensively discuss the connection between the mixing weights
and the performance of the testing procedure and illustrate it with
numerical examples. This link has not been clearly exposed until now.



1 Introduction 

1.1 Mixture model with varying mixing weights 

For more than 20 years, the mixture model has received a lot of attention. This
is due to its ease of interpretation: each component can be viewed as a distinct
group in the data. This model has been widely applied in several areas such as
finance, economics, biology, astronomy and survey methods.

Most of the theoretical results in the literature deal with the estimation of the
components or of the mixing weights. There are two types of mixture models:
the most popular one has fixed mixing weights and the other one has varying
mixing weights.

On the one hand, many statisticians have been interested in estimating the
mixing weights. For example, Hall [12], Titterington [531 and Hall and
Titterington [13] have considered nonparametric estimation of the mixing weights.
Two other examples about the mixing weights are the estimation of a
functional of the weights by van de Geer [25] and the computation of confidence
intervals by Qin [35]. On the other hand, one can be interested in estimating
the components of the mixture. With varying mixing weights, this can easily be
done by applying several well-known methods such as histograms in Lodatko
and Maiboroda [TB], the empirical distribution in Maiboroda [H] or wavelet
thresholding methods in Pokhyl'ko [20]. Finally, the mixing weights and the
mixture components can also be estimated simultaneously; this result holds
in a particular setting for k-variate data introduced by Hall and Zhou [Tl].

More recently, the mixture model has also been studied in the testing
framework. The question usually addressed is whether the observations come from
a non-trivial mixture model or from a trivial one (i.e. with only one component).
This has been done for example by Garel [TU] and [TT] and Delmas [S] in the
case of fixed mixing weights and by Maiboroda [TH] in the case of varying mixing
weights. Their homogeneity tests, which rely respectively on the likelihood ratio
test and on a Kolmogorov-Smirnov type test, are proved to be consistent. Here
we propose to study a testing problem with two samples in a mixture model
with varying mixing weights.

Although the varying mixing weights model does not seem natural at first sight,
one can think of several situations where it can be useful. Let us give three
examples that illustrate its usefulness.

Social science 

This first example is the closest to the varying mixing weights model that is
studied here. Let us consider an organization divided into several departments,
such as an enterprise. Aggregated information is only known at the department
level, e.g. the proportion of men and women, the proportion of graduates and
undergraduates, the proportion of married and unmarried people, etc. The
researcher is interested in a variable, such as salary, for these subgroups. For
each person, the researcher has only recorded salary and department. The
information of interest, which would allow the sample to be divided into
subgroups, is unavailable at the individual level. This can happen if the
researcher has forgotten to record this information when collecting the data;
this frequently happens when a new question arises during the study of the data.
Another reason can be that the law forbids recording such information at the
individual level; for example this is the case of origins or races in many
countries. There is a wealth of works on partially missing data (see McKnight
et al. [17] for example) but the case of entirely missing data has never really
been considered. From our point of view, a varying mixing weights model is a way
to cope with this lack of information at the individual level and to allow the
researcher to reconstruct information for each subgroup. Although we are aware
of methodological problems, we want to emphasize that in this case the varying
mixing weights are exactly known to the researcher; indeed, aggregated
information often exists and is much easier to collect than individual
information.

Image analysis

Let us consider a simple picture taken at a party and consisting of people and
background. There is usually no way to tell at the pixel level whether a pixel
comes from people or from background. Nevertheless one can think of some kind of
aggregated information to roughly divide the image into several areas. In the
center of the image, there are usually mainly people and only little background.
In the area surrounding the center, there is mainly background although a few
people can be scattered here and there. Therefore the image can be divided
into two areas. This written description of the image can be translated into
a mathematical description, namely the varying mixing weights model. In this
model, the statistician will be able to extract distinctive features of the picture
concerning people or background. We are aware that methodological problems
can appear in this setup. For example, the spatial structure is not taken into
account. Moreover, the weights may be only roughly known, which can
be a problem. Nevertheless for some types of images, such as satellite images,
one can assume that the weights are accurately known. Indeed, as the area
under scrutiny is exactly known from a geographical point of view, one can use
aggregated information about surfaces such as the proportions of forest, land,
city, water, etc.

Finance 

Mixture densities have proved to be useful in volatility modeling (see Bernhard
and Leblang [3], Avellaneda [5] for example). If one considers the volatility
clustering effect (see Cont [6] for example), one can roughly divide time into
periods where the proportions of high and low volatility are estimated. Indeed,
during each period it might be hard to exactly label observations as corresponding
to low or high volatility. Therefore the varying mixing weights model can be
considered and help to extract useful features of the mixture components. This
case with estimated proportions is not solved here. Although it is beyond the
scope of this paper, we briefly discuss it in the section devoted to possible
extensions.

Let us now come back to our testing problem with two samples in a mixture
model with varying mixing weights: let $Y_1, \dots, Y_n$ and $Z_1, \dots, Z_n$ be
two independent $n$-samples of independent random variables. We propose to study
in this paper whether these two samples of random variables come from the same
mixture of $M$ unknown densities $p_u$ ($1 \le u \le M$) or not. We assume that the
mixing weights associated with each observation are available to the
statistician. In Butucea and Tribouley [4] some procedures are proposed to test
whether two $n$-samples of i.i.d. variables have a common probability density.
Their setting is equivalent to the case $M = 1$ in our mixture problem. Here the
problem appears more complex since the two samples are not based on random
variables with the same marginal densities. Our results show that there is no
loss in the minimax rate compared to the simpler case studied by Butucea and
Tribouley [4]. In Section 2 we provide an asymptotically minimax test which is
based on wavelet methods and we prove the dependence between the mixing weights
and the constants appearing in the definition of the minimax rate of testing.
This phenomenon has not been studied until now and is extensively discussed in
this paper. In addition to our theoretical results, some numerical experiments
are given in Section 3 in order to illustrate the strong connection between the
mixing weights and the performance of the test. As expected, our test performs
very well for various mixture models. The last two sections are respectively
devoted to possible extensions of this work and to the proofs of the main results.

Here we introduce the wavelet framework that will be used.

1.2 Wavelet framework

We first recall that wavelets have often been applied in different mathematical
fields such as approximation theory, signal analysis and statistics. In
particular, many recent statistical works on estimation (see among others
Autin [1], Donoho et al. [9], Cohen et al. [5]) and on hypothesis testing
(see Spokoiny [25]) use the wavelet setting to provide efficient estimators and
tests. There are many explanations for the huge interest in the wavelet setting.
One of them is that wavelet bases are localized both in frequency and in time,
contrary to the classical Fourier basis which is only localized in frequency. As
a consequence, the wavelet setting appears to be well adapted to describe local
characteristics of a signal to be reconstructed.

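Let $\phi$ and $\psi$ be two compactly supported functions of $L_2(\mathbb{R})$ and
denote, for all $j$ in $\mathbb{N}$, all $k$ in $\mathbb{Z}$ and all $x$ in $\mathbb{R}$,

$$\phi_{jk}(x) = 2^{j/2}\phi(2^j x - k) \quad\text{and}\quad \psi_{jk}(x) = 2^{j/2}\psi(2^j x - k).$$

Suppose that for any $j$ in $\mathbb{N}$:

• $\{\phi_{jk}, \psi_{j'k'};\ j' \ge j;\ k, k' \in \mathbb{Z}\}$ constitutes an orthonormal basis of $L_2(\mathbb{R})$,

• $\mathrm{support}(\phi) \cup \mathrm{support}(\psi) \subset [-L, L[$ for some $L > 0$.

Some of the most popular examples of such bases, called compactly supported
orthonormal wavelet bases, are given in Daubechies 0. The function $\phi$ is called
the scaling function and $\psi$ the associated wavelet.

Any function $h$ in $L_2(\mathbb{R})$ can be represented as:

$$h(t) = \sum_{k \in \mathbb{Z}} \alpha_{jk}\phi_{jk}(t) + \sum_{j' \ge j}\sum_{k \in \mathbb{Z}} \beta_{j'k}\psi_{j'k}(t)$$

where $\forall j \in \mathbb{N}, \forall j' \ge j, \forall k \in \mathbb{Z}$:

• $\alpha_{jk} = \int h(t)\phi_{jk}(t)\,dt$,

• $\beta_{j'k} = \int h(t)\psi_{j'k}(t)\,dt$,

• the supports of $\phi_{jk}$ and $\psi_{jk}$ are contained in $\{x \in \mathbb{R} : -L \le 2^j x - k < L\} = \left[\frac{k-L}{2^j}, \frac{k+L}{2^j}\right[$.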
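As an illustration of the dilated and translated functions $\phi_{jk}$ and $\psi_{jk}$, the following minimal sketch uses the Haar basis (the basis used later in the numerical experiments) and checks a few orthonormality relations numerically. The function names and the integration grid are illustrative assumptions, not part of the original paper.

```python
import numpy as np

def haar_phi(x):
    """Haar scaling function: indicator of [0, 1)."""
    x = np.asarray(x, dtype=float)
    return ((x >= 0.0) & (x < 1.0)).astype(float)

def haar_psi(x):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1)."""
    x = np.asarray(x, dtype=float)
    return haar_phi(2.0 * x) - haar_phi(2.0 * x - 1.0)

def phi_jk(x, j, k):
    """phi_jk(x) = 2^(j/2) * phi(2^j x - k)."""
    return 2.0 ** (j / 2.0) * haar_phi(2.0 ** j * np.asarray(x, dtype=float) - k)

def psi_jk(x, j, k):
    """psi_jk(x) = 2^(j/2) * psi(2^j x - k)."""
    return 2.0 ** (j / 2.0) * haar_psi(2.0 ** j * np.asarray(x, dtype=float) - k)

def inner(f, g, lo=-4.0, hi=4.0, m=2 ** 16):
    """L2 inner product by the midpoint rule on a dyadic grid (assumed grid;
    exact up to rounding for Haar functions of moderate level)."""
    h = (hi - lo) / m
    x = lo + (np.arange(m) + 0.5) * h
    return float(np.sum(f(x) * g(x)) * h)

# Unit norm, orthogonality of translates, and orthogonality of a finer wavelet.
norm = inner(lambda x: phi_jk(x, 2, 1), lambda x: phi_jk(x, 2, 1))
shifted = inner(lambda x: phi_jk(x, 2, 1), lambda x: phi_jk(x, 2, 2))
cross = inner(lambda x: phi_jk(x, 2, 1), lambda x: psi_jk(x, 3, 2))
```

For the Haar basis the support condition holds with any $L \ge 1$.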



Let us now describe the testing problem we focus on.

1.3 Mathematical description of the testing problem

Let $Y_1, \dots, Y_n$ be a sample of independent random variables with unknown
marginal densities

$$f_i(\cdot) = \sum_{u=1}^{M} \omega_u(i)\, p_u(\cdot), \quad 1 \le i \le n,$$

and let $Z_1, \dots, Z_n$ be another sample of independent random variables with
unknown marginal densities

$$g_i(\cdot) = \sum_{u=1}^{M} \sigma_u(i)\, q_u(\cdot), \quad 1 \le i \le n.$$

We also assume that the two samples are independent.

Here and in what follows, we suppose that the mixing weights
$(\omega_u(i),\ 1 \le u \le M,\ 1 \le i \le n)$ and $(\sigma_u(i),\ 1 \le u \le M,\ 1 \le i \le n)$
are known to the statistician and satisfy

• $\forall (u, i) \in \{1, \dots, M\} \times \{1, \dots, n\}$, $\min(\omega_u(i), \sigma_u(i)) \ge 0$,

• $\forall i \in \{1, \dots, n\}$, $\sum_{u=1}^{M} \omega_u(i) = \sum_{u=1}^{M} \sigma_u(i) = 1$,

whereas the densities $p_u$ and $q_u$ ($1 \le u \le M$) are unknown.

Let us denote $\vec{p} = (p_1, \dots, p_M)$ and $\vec{q} = (q_1, \dots, q_M)$.
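To make the sampling scheme concrete, here is a minimal simulation sketch of a sample whose $i$-th observation is drawn from the mixture with weights $(\omega_1(i), \dots, \omega_M(i))$. The function name, the `(n, M)` layout of the weight array and the component samplers are illustrative assumptions; the Gaussian components in the usage lines are only an example.

```python
import numpy as np

def sample_varying_mixture(weights, samplers, rng):
    """Draw one observation per row of `weights`.

    weights  : array of shape (n, M), row i holds the known weights of observation i
    samplers : list of M callables (rng, size) -> array, one per component density
    Returns the sample and the (normally unobserved) component labels.
    """
    weights = np.asarray(weights, dtype=float)
    n, M = weights.shape
    cum = np.cumsum(weights, axis=1)
    u = rng.random(n)
    # Inverse-cdf draw of the hidden label: number of cumulative weights below u.
    labels = np.minimum((u[:, None] > cum).sum(axis=1), M - 1)
    sample = np.empty(n)
    for m in range(M):
        idx = labels == m
        sample[idx] = samplers[m](rng, int(idx.sum()))
    return sample, labels

# Usage: two Gaussian components, weights changing after 80% of the sample.
rng = np.random.default_rng(0)
n = 500
w = np.array([[0.6, 0.4]] * int(0.8 * n) + [[0.4, 0.6]] * (n - int(0.8 * n)))
components = [lambda r, m: r.normal(-2.0, 1.0, m),   # N(-2, 1)
              lambda r, m: r.normal(3.0, 2.0, m)]    # N(3, 4), assuming 4 is a variance
y, hidden = sample_varying_mixture(w, components, rng)
```

In the testing problem only `y` and `w` are available to the statistician; the labels are returned here purely for checking the simulation.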

We study in this paper a nonparametric procedure to test whether the samples
result from the same mixture of densities. Let $\mathcal{D}$ denote the set of all
probability densities with respect to the Lebesgue measure on $\mathbb{R}$. For any
real number $R > 0$, we define

$$\Theta_0(R) = \{(\vec{p}, \vec{q}) : \forall u \in \{1, \dots, M\},\ p_u = q_u \in \mathcal{S}(R)\}$$

where $\mathcal{S}(R) = \mathcal{D} \cap L_\infty(R) \cap L_2(R)$.

We consider the following null hypothesis

$$\mathcal{H}_0 : (\vec{p}, \vec{q}) \in \Theta_0(R).$$

For a given $C > 0$, we define

$$\Theta_1(R, C, r_n, s) = \{(\vec{p}, \vec{q}) : \forall u \in \{1, \dots, M\},\ p_u - q_u \in B^s_{2,\infty}(R),\ \exists u \in \{1, \dots, M\},\ (p_u, q_u) \in \Lambda_n(R, C)\}$$

where $\Lambda_n(R, C) = \{(p, q) \in (\mathcal{D} \cap L_\infty(R))^2,\ \|p - q\|_2 \ge C r_n\}$, for a sequence $r_n$
tending to 0 when $n$ goes to infinity, and $B^s_{2,\infty}(R)$ is the $R$-ball of a functional
space defined below. We consider the following alternative

$$\mathcal{H}_1 : (\vec{p}, \vec{q}) \in \Theta_1(R, C, r_n, s).$$

As usual in the nonparametric setting, we focus on a large class of functions
having some regularity so as to derive optimal properties. For the chosen wavelet
basis, the space $B^s_{2,\infty}(R)$ represents the $R$-ball of the so-called Besov body which
is composed of all the functions $h \in L_2(\mathbb{R})$ for which the sequence of wavelet
coefficients $(\alpha_{jk}, \beta_{j'k},\ j \in \mathbb{N},\ j' \ge j,\ k \in \mathbb{Z})$ satisfies:



The minimax setting 

In this paragraph we recall the minimax approach which is often used to
evaluate the performance of testing procedures. Given the sum of the probability
errors, say $\gamma \in\ ]0, 1[$, we study the optimal separation rate $r_n$ between the null
hypothesis and the alternative. This rate $r_n$ is the best possible rate separating
at least one of the $M$ couples of density components $p_u$ and $q_u$. It is usually
called the minimax rate. Let us recall the classical definition of the separation
rate.

Definition 1.1 Let $0 < \gamma < 1$. We say that $r_n$ is the minimax rate separating
$\mathcal{H}_0$ and $\mathcal{H}_1$ of our testing problem at level $\gamma$ if the two following statements are
satisfied:

1. there exist a sequence of test procedures $\Delta^*$ and a constant $C_\gamma$ such that

$$\limsup_{n \to \infty} \left( \sup_{(\vec{p}, \vec{q}) \in \Theta_0(R)} P_{\vec{p}, \vec{q}}(\Delta^* = 1) + \sup_{(\vec{p}, \vec{q}) \in \Theta_1(R, C, r_n, s)} P_{\vec{p}, \vec{q}}(\Delta^* = 0) \right) \le \gamma \quad (1)$$

for all $C > C_\gamma$;

2. there exists a constant $c_\gamma$ such that

$$\liminf_{n \to \infty} \inf_{\Delta} \left( \sup_{(\vec{p}, \vec{q}) \in \Theta_0(R)} P_{\vec{p}, \vec{q}}(\Delta = 1) + \sup_{(\vec{p}, \vec{q}) \in \Theta_1(R, C, r_n, s)} P_{\vec{p}, \vec{q}}(\Delta = 0) \right) \ge \gamma \quad (2)$$

for all $C < c_\gamma$, where the infimum is taken over all test procedures $\Delta$.

Hypothesis on the model

In our study we suppose that the mixing weights $(\omega_u(i),\ 1 \le u \le M,\ 1 \le i \le n)$
and $(\sigma_u(i),\ 1 \le u \le M,\ 1 \le i \le n)$ satisfy an additional hypothesis. Let us denote
by $\Omega = (\Omega_{u,i})$ the matrix with coefficients $\Omega_{u,i} = \omega_u(i)$ and by $\Sigma = (\Sigma_{u,i})$ the
matrix with coefficients $\Sigma_{u,i} = \sigma_u(i)$.

HYP-1 The smallest eigenvalues of the $(M \times M)$-matrices $\Gamma_n = \Omega\Omega^*$ and $\Gamma'_n = \Sigma\Sigma^*$
are both larger than or equal to $Kn$, with $0 < K < 1$.
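The largest constant $K$ compatible with HYP-1 can be computed directly from the known weights. The sketch below does so for block-constant weights such as those of Gaussian Model 1 in Section 3.3; the helper names `block_weights` and `hyp1_constant` are illustrative assumptions.

```python
import numpy as np

def block_weights(n, blocks):
    """Build an (M, n) weight matrix from blocks [(fraction_of_sample, weights), ...]."""
    cols = []
    for frac, w in blocks:
        cols.extend([w] * int(round(frac * n)))
    return np.array(cols, dtype=float).T

def hyp1_constant(omega, sigma):
    """Largest K such that both Gram matrices have smallest eigenvalue >= K * n."""
    n = omega.shape[1]
    lam_omega = np.linalg.eigvalsh(omega @ omega.T).min()
    lam_sigma = np.linalg.eigvalsh(sigma @ sigma.T).min()
    return min(lam_omega, lam_sigma) / n

# Weights of Gaussian Model 1 (Table 1), here with n = 1000.
omega = block_weights(1000, [(0.8, (0.6, 0.4)), (0.2, (0.4, 0.6))])
sigma = block_weights(1000, [(0.3, (0.2, 0.8)), (0.7, (0.5, 0.5))])
K = hyp1_constant(omega, sigma)   # close to the value 0.013 reported with Table 2
```

For block-constant weights the value of $K$ does not depend on $n$, since both Gram matrices scale linearly with the sample size.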

We recall the following proposition due to Maiboroda [19].

Proposition 1.1 Suppose that the previous conditions are satisfied by the
mixing weights $(\omega_u(i),\ 1 \le u \le M,\ 1 \le i \le n)$ and $(\sigma_u(i),\ 1 \le u \le M,\ 1 \le i \le n)$
associated with the model. Then, for any $l \in \{1, \dots, M\}$, there exists a solution of the two problems

[ find $a_l = \{a_l(i),\ i = 1, \dots, n\}$ such that $\langle \omega_k, a_l \rangle_n := \frac{1}{n} \sum_{i=1}^{n} \omega_k(i) a_l(i) = \delta_{kl}$ ],

[ find $b_l = \{b_l(i),\ i = 1, \dots, n\}$ such that $\langle \sigma_k, b_l \rangle_n := \frac{1}{n} \sum_{i=1}^{n} \sigma_k(i) b_l(i) = \delta_{kl}$ ],

where $\delta_{kl}$ is the Kronecker delta. According to HYP-1 this solution satisfies

$$\sum_{l=1}^{M} \langle a_l, a_l \rangle_n := \frac{1}{n} \sum_{l=1}^{M} \sum_{i=1}^{n} a_l^2(i) \le \frac{M}{K}, \quad (3)$$

$$\sum_{l=1}^{M} \langle b_l, b_l \rangle_n := \frac{1}{n} \sum_{l=1}^{M} \sum_{i=1}^{n} b_l^2(i) \le \frac{M}{K}. \quad (4)$$
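One way to obtain vectors satisfying the biorthogonality relations of Proposition 1.1 is to take the minimum-norm solution $A = n\,(\Omega\Omega^*)^{-1}\Omega$, whose rows play the role of the $a_l$. The sketch below assumes this particular choice (the paper only states existence) and checks the relations and the bound $M/K$ numerically; `dual_vectors` is an illustrative name.

```python
import numpy as np

def dual_vectors(weights):
    """Rows a_l of the returned (M, n) array satisfy (1/n) * sum_i w_k(i) a_l(i) = delta_kl.

    Assumed construction: minimum-norm solution n * (W W^T)^{-1} W.
    """
    W = np.asarray(weights, dtype=float)
    n = W.shape[1]
    return n * np.linalg.solve(W @ W.T, W)

# Small example: 8 observations with weights (0.6, 0.4), 2 with (0.4, 0.6).
W = np.array([[0.6, 0.4]] * 8 + [[0.4, 0.6]] * 2).T
M, n = W.shape
A = dual_vectors(W)

gram_check = W @ A.T / n                      # should be the identity matrix
K = np.linalg.eigvalsh(W @ W.T).min() / n     # constant of HYP-1 for this matrix
norm_sum = (A ** 2).sum() / n                 # left-hand side of inequality (3)
```

With this choice, the left-hand side of (3) equals $n\,\mathrm{tr}\big((\Omega\Omega^*)^{-1}\big)$, which is at most $M/K$ because every eigenvalue of $\Omega\Omega^*$ is at least $Kn$.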

2 Nonparametric test procedure 

This paragraph deals with the case where the regularity $s$ of the Besov body
that appears in $\mathcal{H}_1$ is known. From now on we denote by $a_l$ and $b_l$ the
$n$-vectors which are the solutions of the two optimization problems appearing in
Proposition 1.1. Let us describe the asymptotically minimax decision rule.

2.1 Definition of the test procedure 

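For each level parameter $j$, we define the test procedure $\Delta_j$ comparing the test
statistic

$$T_j = \frac{1}{n^2} \sum_{l=1}^{M} \sum_{k} \sum_{i_1 \ne i_2} \left[a_l(i_1)\phi_{jk}(Y_{i_1}) - b_l(i_1)\phi_{jk}(Z_{i_1})\right] \left[a_l(i_2)\phi_{jk}(Y_{i_2}) - b_l(i_2)\phi_{jk}(Z_{i_2})\right]$$

with a threshold value $t_n$ which depends on a constant $t$ chosen later. We define

$$\Delta_j = \begin{cases} 1 & \text{if } T_j > t_n, \\ 0 & \text{if } T_j \le t_n. \end{cases}$$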
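A statistic of this U-statistic form can be evaluated in linear time, since $\sum_{i_1 \ne i_2} x_{i_1} x_{i_2} = (\sum_i x_i)^2 - \sum_i x_i^2$. The sketch below does this for the Haar scaling function $\phi = 1_{[0,1)}$. Both the Haar choice and the $1/n^2$ normalisation are assumptions made for illustration, and `haar_test_statistic` is a hypothetical name.

```python
import numpy as np

def haar_test_statistic(y, z, a, b, j):
    """U-statistic sum over l, k and i1 != i2 of x_{i1} x_{i2}, divided by n^2 (assumed),
    with x_i = a_l(i) phi_jk(Y_i) - b_l(i) phi_jk(Z_i) and phi the Haar scaling function.

    a, b : arrays of shape (M, n) holding the vectors a_l and b_l.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    n = y.size
    # For Haar, phi_jk(x) = 2^(j/2) if floor(2^j x) == k and 0 otherwise.
    ky = np.floor(2.0 ** j * y).astype(int)
    kz = np.floor(2.0 ** j * z).astype(int)
    shift = min(ky.min(), kz.min())
    ky, kz = ky - shift, kz - shift
    nbins = max(ky.max(), kz.max()) + 1
    total = 0.0
    for a_l, b_l in zip(np.asarray(a, dtype=float), np.asarray(b, dtype=float)):
        # s[k] = sum_i x_i for each translation k (up to the factor 2^(j/2)).
        s = (np.bincount(ky, weights=a_l, minlength=nbins)
             - np.bincount(kz, weights=b_l, minlength=nbins))
        # sum over k and i of x_i^2 (up to the factor 2^j).
        diag = np.sum(a_l ** 2 + b_l ** 2 - 2.0 * a_l * b_l * (ky == kz))
        total += 2.0 ** j * (np.sum(s ** 2) - diag)
    return total / n ** 2

def decide(t_stat, t_n):
    """Decision rule: 1 (reject H0) if the statistic exceeds the threshold, else 0."""
    return int(t_stat > t_n)
```

The vectors `a` and `b` are expected to solve the two problems of Proposition 1.1 for the weights of the first and second sample respectively.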



2.2 Properties of the test statistic 

In this section, we provide two propositions which will be crucial when evaluating 
the performance of our test procedure. They deal with the behaviors of its 
expectation and its variance. 

Proposition 2.1 Let $j$ be any given level parameter. Then,

$$E(T_j) = \sum_{l=1}^{M} \sum_{k} \left( \int (p_l - q_l)\phi_{jk} \right)^2 - \frac{1}{n^2} \sum_{l=1}^{M} \sum_{k} \sum_{i=1}^{n} \left( a_l(i) E[\phi_{jk}(Y_i)] - b_l(i) E[\phi_{jk}(Z_i)] \right)^2.$$

Remark 2.1 For the particular case where the sequences of the mixing weights
$(\omega_u(i),\ 1 \le u \le M,\ 1 \le i \le n)$ and $(\sigma_u(i),\ 1 \le u \le M,\ 1 \le i \le n)$ are identical,
the test statistic $T_j$ is centered under the null hypothesis.



Corollary 2.1 For any $j \in \mathbb{N}$,

Proposition 2.2 There exists a constant $C = C(R, L, \|\phi\|_\infty) > 0$ such that

(nj 1 / oj \ /\,/f2 

Remark 2.2 Under the null hypothesis the variance of the test statistic $T_j$ is
less than or equal to $C M^2 K^{-2} 2^j n^{-2}$.



Kn 



2.3 Minimax performance of the test procedure 

For any $s > 0$, let $(r_n)_{n \in \mathbb{N}^*}$ be the sequence such that

$$r_n = n^{-\frac{2s}{1+4s}}, \quad \forall n \in \mathbb{N}^*.$$

The following theorem shows that the test procedure defined in Section 2.1
provides an accurate upper bound when it is well calibrated.

Theorem 2.1 (Upper bound) Fix $0 < \gamma < 1$ and consider the test procedure
$\Delta^* = \Delta_{j_n}$ where $j_n$ is the smallest integer such that $2^{j_n} \ge n^{\frac{2}{1+4s}}$. Let $t$ and
$C_\gamma$ be two positive real numbers defined as follows:




Then

$$\limsup_{n \to \infty} \left( \sup_{(\vec{p}, \vec{q}) \in \Theta_0(R)} P_{\vec{p}, \vec{q}}(\Delta^* = 1) + \sup_{(\vec{p}, \vec{q}) \in \Theta_1(R, C, r_n, s)} P_{\vec{p}, \vec{q}}(\Delta^* = 0) \right) \le \gamma \quad (5)$$

for all $C > C_\gamma$.

Although the exact value of the constant $C_\gamma$ is very complicated, it can be
exactly calculated by following the proofs.
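For given $n$ and $s$, the resolution level and the separation rate are simple to evaluate. The sketch below assumes the rate $r_n = n^{-2s/(1+4s)}$ and the level $j_n$ defined as the smallest integer with $2^{j_n} \ge n^{2/(1+4s)}$; the function names are illustrative.

```python
def resolution_level(n, s):
    """Smallest integer j such that 2^j >= n^(2 / (1 + 4 s))."""
    target = n ** (2.0 / (1.0 + 4.0 * s))
    j = 0
    while 2 ** j < target:
        j += 1
    return j

def separation_rate(n, s):
    """Separation rate r_n = n^(-2 s / (1 + 4 s))."""
    return n ** (-2.0 * s / (1.0 + 4.0 * s))

# Example with the regularity s = 4 used in the numerical experiments.
levels = {n: resolution_level(n, 4) for n in (200, 1000, 5000)}
rate_5000 = separation_rate(5000, 4)
```

With $s = 4$ the selected level is very coarse for the sample sizes considered here, because $n^{2/17}$ grows slowly with $n$.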

Now, let us focus on the lower bound associated with our nonparametric testing
problem $\mathcal{H}_0$ versus $\mathcal{H}_1$.

We aim at providing a constant $c_\gamma$ such that no test procedure
is able to choose $\mathcal{H}_0$ or $\mathcal{H}_1$ with a sum of the probability errors less than $\gamma$
($0 < \gamma < 1$). Obviously, the smaller the distance between $c_\gamma$ and $C_\gamma$, the more
accurate our results. The next theorem proves that our test procedure is
asymptotically minimax.

Similarly to the classical methods for providing lower bounds (see for instance
Gayraud and Pouet [H] or Butucea and Tribouley [4]) we shall consider a
subspace of $\Lambda_n(R, C)$, that is, for any chosen $C_1 > 0$,

$$\Lambda_n(R, C, C_1) = \left\{ (p, q) \in \Lambda_n(R, C);\ \inf_z \min(p(z), q(z)) \ge C_1 \right\}. \quad (6)$$

Theorem 2.2 (Lower bound) Let $0 < \gamma < 1$, $s > 0$ and let $c_\gamma > 0$ satisfy



ct 



/~i2 \ 9— 4s 



ln[4(l - 7)^ + 1] A 2R' 



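Then for all $C < c_\gamma$,

$$\liminf_{n \to \infty} \inf_{\Delta} \left( \sup_{(\vec{p}, \vec{q}) \in \Theta_0(R)} P_{\vec{p}, \vec{q}}(\Delta = 1) + \sup_{(\vec{p}, \vec{q}) \in \Theta_1(R, C, r_n, s)} P_{\vec{p}, \vec{q}}(\Delta = 0) \right) \ge \gamma \quad (7)$$

where the infimum is taken over all test procedures $\Delta$.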



From Theorems 2.1 and 2.2 we deduce the minimax rate of testing. It is the same
as the one found by Butucea and Tribouley [4] when there is only one subgroup.
Compared to Butucea and Tribouley [4], the advances in our results are the
extension to the varying mixing weights model, which allows non-identically
distributed random variables, and the role played by the mixing weights, which
is clearly exposed.

Corollary 2.2 For any $s > 0$, the test procedure $\Delta^*$ is asymptotically minimax
and the minimax rate separating $\mathcal{H}_0$ and $\mathcal{H}_1$ is $r_n = n^{-\frac{2s}{1+4s}}$.



2.4 Discussion about the constants $C_\gamma$ and $c_\gamma$

In the two previous theorems we exhibited two constants appearing in the upper
and the lower bounds. We think that the connection between these constants
and the model's parameters $M$ and $K$ is new and deserves a discussion.
Indeed, recall that

• $C_\gamma$ is the minimal value for $C$ such that our test statistic is able to detect
whether all the mixture components are identical in the two populations with
the sum of the probability errors not exceeding $\gamma$;

• $c_\gamma$ is the maximal value for $C$ such that no test statistic is able to detect
whether all the mixture components are identical in the two populations with the
sum of the probability errors not exceeding $\gamma$.

As a consequence we proved that our test statistic is optimal in the minimax
sense since it attains the minimax rate of convergence separating $\mathcal{H}_0$ and $\mathcal{H}_1$.

According to the definitions of $c_\gamma$ and $C_\gamma$, the reader should be aware that:

• the smaller the constant $K$, the larger the family of mixing weights
satisfying HYP-1;

• the smaller the constant $M$, the bigger (= the worse) the constant $C_\gamma$ and
the bigger the constant $c_\gamma$;

• the smaller the constant $K$, the bigger (= the worse) the constant $C_\gamma$ and
the bigger the constant $c_\gamma$.

Although the exact separation constant is not established in this study (since
$c_\gamma \ne C_\gamma$), we prove that $c_\gamma$ and $C_\gamma$ strongly depend on the smallest eigenvalue
of the matrices $\Omega\Omega^*$ and $\Sigma\Sigma^*$.

3 Numerical experiments and application 

The aim of this section is twofold: to illustrate by numerical experiments the
good performance of the test procedures based on the statistics $T_{j_n}$ and to show
the usefulness of our method on real data.

First, two examples of mixture models are given to show the interest of the
problem we have considered. Next we illustrate the behaviour of the test
statistics.

3.1 Examples of mixture models 

Figure 1 [Mixture with two components]

Consider two populations sampled from the same mixture densities such that

• the size of the two populations $(Y, Z)$ is $n = 500$,

• the ranks of the matrices of the mixing weights $\Omega^*$ and $\Sigma^*$ are 2,

• the two components of the mixtures are the uniform density $\mathcal{U}([-1, 0])$
and the normal density $\mathcal{N}(3, 4)$.

Figure 2 [Mixture with three components]

Consider two populations sampled from the same mixture densities such that

• the size of the two populations $(Y, Z)$ is $n = 500$,

• the ranks of the matrices of the mixing weights $\Omega^*$ and $\Sigma^*$ are 3,

• the three components of the mixtures are the normal densities $\mathcal{N}(-2, 1)$,
$\mathcal{N}(0, 1)$ and $\mathcal{N}(2, 1)$.



Figure 1: Histogram (a) of population Y and histogram (b) of population Z.

Figure 2: Histogram (a) of population Y and histogram (b) of population Z.



In each of Figures 1 and 2, the histograms of the two populations are quite
different although they correspond to mixture models with the same components.
These examples show how hard it is to guess whether the mixture components
of the two populations $(Y, Z)$ are exactly the same or not. Hence, the
statistician needs an adequate test statistic to decide whether the
populations $(Y, Z)$ have the same mixture components or not.

3.2 Construction of the test procedure: calibration of $t_n$

In the theoretical part of this paper we provide a decision rule to test $\mathcal{H}_0$ against
$\mathcal{H}_1$. This decision rule $\Delta_{j_n}$ relies on the sign of $T_{j_n} - t_n$, where $t_n$ is the threshold
value depending on the sum of the errors $\gamma$ and $T_{j_n}$ is the test statistic. In the
positive case (resp. in the negative case) $\Delta_{j_n}$ proposes to accept $\mathcal{H}_1$ (resp. $\mathcal{H}_0$).
From the practical point of view, we give some hints to adjust the threshold
value $t_n$. Here we use the Haar basis and we set $s = 4$. We consider
two different approaches.

The first approach consists in fixing the first type error, $0 < \gamma_1 < 1$, and in
choosing $t_n$ as the quantile of order $1 - \gamma_1$ of the test statistic obtained after
1000 replications of the chosen mixture model.

The second approach consists in choosing $t_n$ as the value for which the sum of
the two errors is minimal, according to the values of the test statistic obtained
after 1000 replications of the chosen mixture model.
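Both calibration approaches reduce to simple operations on simulated values of the test statistic. The following sketch assumes that arrays of statistics simulated under the null hypothesis and under a chosen alternative are already available; the function names and the restriction of candidate thresholds to the observed values are illustrative assumptions.

```python
import numpy as np

def threshold_from_level(null_stats, gamma1):
    """First approach: empirical quantile of order 1 - gamma1 of the statistic under H0."""
    return float(np.quantile(np.asarray(null_stats, dtype=float), 1.0 - gamma1))

def threshold_min_global_error(null_stats, alt_stats):
    """Second approach: threshold minimising the sum of the two empirical errors.

    Candidate thresholds are the observed values of the statistic (an assumption);
    ties are resolved in favour of the smallest candidate.
    """
    null_stats = np.asarray(null_stats, dtype=float)
    alt_stats = np.asarray(alt_stats, dtype=float)
    best_t, best_err = None, np.inf
    for t in np.unique(np.concatenate([null_stats, alt_stats])):
        first_type = np.mean(null_stats > t)    # rejections under H0
        second_type = np.mean(alt_stats <= t)   # acceptances under H1
        if first_type + second_type < best_err:
            best_t, best_err = float(t), float(first_type + second_type)
    return best_t, best_err
```

In the experiments of this section, each array would contain the 1000 replicated values of $T_{j_n}$.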

3.3 Connection between K and the performance of the 
test procedure. 

The aim of this paragraph is to illustrate the connection between the value of $K$
and the performance of our test procedure. We provide simulations of Gaussian
mixture models and we give for several values of $n$

• the value of $t_n$ associated with a first type error equal to 10%,

• the power of the test procedure based on the threshold value $t_n$,

• the minimum of the global error $\gamma_{opt}$ (the sum of the first type and the
second type errors) reachable by the test procedure,

• the value $t_{opt}$ which corresponds to the global error $\gamma_{opt}$.

We consider two samples: $Y_1, \dots, Y_n$ and $Z_1, \dots, Z_n$. The two mixture components
are such that

• under $\mathcal{H}_0$, $p_1(\cdot) = q_1(\cdot) \sim \mathcal{N}(-2, 1)$ and $p_2(\cdot) = q_2(\cdot) \sim \mathcal{N}(3, 4)$,

• under $\mathcal{H}_1$, $p_1(\cdot) \sim \mathcal{N}(-2, 1)$, $p_2(\cdot) \sim \mathcal{N}(3, 4)$, $q_1(\cdot) \sim \mathcal{N}(0, 1)$ and $q_2(\cdot) \sim$
Weights of samples Y and Z for Gaussian Model 1 are described in Table 1. 



Sample   Range of i               $\sigma_1(i)$ or $\omega_1(i)$   $\sigma_2(i)$ or $\omega_2(i)$
Y        i = 1, ..., 0.8n         0.6                 0.4
Y        i = 0.8n + 1, ..., n     0.4                 0.6
Z        i = 1, ..., 0.3n         0.2                 0.8
Z        i = 0.3n + 1, ..., n     0.5                 0.5

Table 1: Model 1



The results are given in Table 2. We point out that the constant $K$ related to
the smallest eigenvalue is very close to 0. Therefore we expect poor results.

Gaussian Model 1   n = 200   n = 500    n = 1000
$t_n$              0.289     0.135      0.080
Power                        (■,H.l%    83.7%
$\gamma_{opt}$     52.6%     38.2%      23.5%
$t_{opt}$          0.022     0.080      0.092

Table 2: K = 0.013



Weights of samples Y and Z for Gaussian Model 2 are described in Table 3. 



Sample   Range of i               $\sigma_1(i)$ or $\omega_1(i)$   $\sigma_2(i)$ or $\omega_2(i)$
Y        i = 1, ..., 0.8n         0.8                 0.2
Y        i = 0.8n + 1, ..., n     0.3                 0.7
Z        i = 1, ..., 0.3n         0.1                 0.9
Z        i = 0.3n + 1, ..., n     0.4                 0.6

Table 3: Model 2



For this setup, the constant K is almost three times the one appearing in Gaus- 
sian Model 1. Therefore we expect improved results. 



Gaussian Model 2   n = 200   n = 500   n = 1000
$t_n$              0.994     0.061     0.027
Power              85.2%     91.5%     96.8%
$\gamma_{opt}$     24.6%     16.3%     9.5%
$t_{opt}$          0.078     0.103     0.047

Table 4: K = 0.033



Weights of samples Y and Z for Gaussian Model 3 are described in Table 5. 



Sample   Range of i               $\sigma_1(i)$ or $\omega_1(i)$   $\sigma_2(i)$ or $\omega_2(i)$
Y        i = 1, ..., 0.8n         0.8                 0.2
Y        i = 0.8n + 1, ..., n     0.3                 0.7
Z        i = 1, ..., 0.3n         0.9                 0.1
Z        i = 0.3n + 1, ..., n     0.3                 0.7

Table 5: Model 3



In this setup, the constant $K$ is more than five times the one appearing in
Gaussian Model 1 and more than twice the one appearing in Gaussian Model
2. Therefore we expect better results.

Gaussian Model 3   n = 200   n = 500   n = 1000
$t_n$              0.054     0.030     0.015
Power              97.1%     96.7%     98.1%
$\gamma_{opt}$     10.5%     9.6%      6.5%
$t_{opt}$          0.066     0.064     0.034

Table 6: K = 0.068



According to the numerical results in Tables 2, 4 and 6, it is clear that for a fixed
$n$, the larger the value of $K$, the better the performance of the test procedure.
Indeed, when the first type error is 10%, we see that increasing $K$
increases the power of the test procedure. Moreover, we remark that the optimal
global error $\gamma_{opt}$ increases when the value of $K$ decreases. In fact, this is not
surprising as this behaviour was predicted by our theoretical results: the smaller
the value of $K$, the larger the constant $C_\gamma$ (see Theorem 2.1). In other words,
in a mixture model with a small value of $K$ one needs a lot of observations to
ensure good performance of our test procedure.

3.4 Application to real data 

In this part we apply our results to real data. The dataset comes from a survey
conducted by the French national statistical agency, the Institut National de la
Statistique et des Études Économiques (abbreviated to INSEE). This survey, called
Déclaration Annuelle des Données Sociales (abbreviated to DADS), took place
in 2007 and is about employees and related variables such as salary, working
time or type of job. All information regarding this survey can be found on the
website of INSEE (see DADS 2007 postes et salariés, http://www.insee.fr). As
far as we are concerned, we focus on working time per year. More precisely
our goal is to make two comparisons at the same time:

1. working time of men in Île-de-France (the region surrounding Paris in France,
abbreviated to $\mathcal{I}$ below) and working time of men in all other regions of
France (abbreviated to $\mathcal{P}$ below),

2. working time of women in Île-de-France and working time of women in all
other regions of France.

In this study we decide to only consider highly skilled workers such as executive
staff and managers. There are two populations:

• commercial and administrative staff (abbreviated to CAd),

• technical staff (abbreviated to Tech).

We restrict to people working more than 1 645 hours per year. The variable of
interest is the number of working hours per year divided by 1 645. Therefore it
is a ratio equal to or greater than 1.

Available information about the different subpopulations of $\mathcal{I}$ and $\mathcal{P}$ is gathered in
the following table:

                   Île-de-France ($\mathcal{I}$)     Other regions ($\mathcal{P}$)
Executive staff    Men       Women       Men       Women
CAd                58.99%    41.01%      67.96%    32.04%
Tech               81.08%    18.92%      86.72%    13.28%

Table 7: Proportions of subpopulations by sex, area and job



There are 65 558 people in $\mathcal{I}$ and 75 062 people in $\mathcal{P}$.

To begin, we pay attention to the mean of the working-ratio of each population,
namely $m_{\mathcal{I}}$ and $m_{\mathcal{P}}$. Although information about sex (men or women) is
available in the study conducted by INSEE, we assume that it is unknown in order
to show the interest of our model.

Let $\sigma_{\mathcal{I}}$ and $\sigma_{\mathcal{P}}$ denote the standard deviations of populations $\mathcal{I}$ and $\mathcal{P}$ according
to the variable of interest. We suppose that a random sample of size $n =
5\,000$ in each population is available and is drawn as follows:

• 2 500 people living in $\mathcal{I}$ are CAd and 2 500 people living in $\mathcal{I}$ are Tech,

• 2 500 people living in $\mathcal{P}$ are CAd and 2 500 people living in $\mathcal{P}$ are Tech.

We are interested in the preliminary testing problem $(\mathcal{T}_1)$:

$$\mathcal{H}_0 : m_{\mathcal{I}} = m_{\mathcal{P}} \quad \text{vs} \quad \mathcal{H}_1 : m_{\mathcal{I}} \ne m_{\mathcal{P}}.$$

We decide to address this testing problem by using the test statistic

$$U = \frac{|\hat{m}_{\mathcal{I}} - \hat{m}_{\mathcal{P}}|}{\sqrt{\hat{\sigma}_{\mathcal{I}}^2 + \hat{\sigma}_{\mathcal{P}}^2}},$$

where $\hat{m}_{\mathcal{I}}$ (resp. $\hat{m}_{\mathcal{P}}$) and $\hat{\sigma}_{\mathcal{I}}$ (resp. $\hat{\sigma}_{\mathcal{P}}$) denote the usual estimators of $m_{\mathcal{I}}$
(resp. $m_{\mathcal{P}}$) and $\sigma_{\mathcal{I}}$ (resp. $\sigma_{\mathcal{P}}$), when using stratified random samplings like
ours. Under the null hypothesis $\mathcal{H}_0$, the random variable $U$ is asymptotically
normally distributed with mean 0 and variance 1.

Here are the values computed from the samples:

Île-de-France ($\mathcal{I}$)                 Other regions ($\mathcal{P}$)
$\hat{m}_{\mathcal{I}}$ = 1.1605              $\hat{m}_{\mathcal{P}}$ = 1.1531
$\hat{\sigma}_{\mathcal{I}}$ = 0.0015         $\hat{\sigma}_{\mathcal{P}}$ = 0.0014

Table 8: Estimated means and standard deviations by area

The value of the test statistic $U$ is 3.5582. The related p-value is close to 0.0026.
Accordingly, it strongly seems that $m_{\mathcal{I}} \ne m_{\mathcal{P}}$. In other words, $\mathcal{H}_0$ is
rejected.
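The preliminary statistic is easy to recompute from the summary values. The sketch below assumes the form $U = |\hat{m}_{\mathcal{I}} - \hat{m}_{\mathcal{P}}| / \sqrt{\hat{\sigma}_{\mathcal{I}}^2 + \hat{\sigma}_{\mathcal{P}}^2}$ and uses the rounded values of Table 8, so the result is only expected to be close to the reported value 3.5582, not identical to it. The function name is illustrative.

```python
import math

def two_sample_statistic(m1, m2, s1, s2):
    """Absolute difference of the estimated means divided by the combined standard error."""
    return abs(m1 - m2) / math.sqrt(s1 ** 2 + s2 ** 2)

# Rounded estimates from Table 8.
u = two_sample_statistic(1.1605, 1.1531, 0.0015, 0.0014)
```

The small gap between this value and the reported one is compatible with the rounding of the four inputs to four decimal places.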

At this stage, a natural question arises: what is the reason for such a difference?
Two hypotheses could explain it:

1. the distinct values of $m_{\mathcal{I}}$ and $m_{\mathcal{P}}$ are only related to the different proportions
of men (or analogously women) between the two populations:

         Île-de-France ($\mathcal{I}$)     Other regions ($\mathcal{P}$)
Men      68.93% (45187)        76.70% (57575)
Women    31.07% (20371)        23.30% (17487)

Table 9: Proportions of subpopulations by area and sex

2. the distinct values of $m_{\mathcal{I}}$ and $m_{\mathcal{P}}$ are also related to different distributions of
the working-ratio of population $\mathcal{I}$ (abbreviated to $W.R.^{(\mathcal{I})}$) and of the working-ratio
of population $\mathcal{P}$ (abbreviated to $W.R.^{(\mathcal{P})}$).

Choosing between these two hypotheses is at first glance difficult
when only considering two random samples of size $n$ in each population without
the knowledge of sex (man or woman). Nevertheless, a way to address the
testing problem $(\mathcal{T}_2)$:

$\mathcal{H}'_0$ : the distributions of $W.R.^{(\mathcal{I})}$ and $W.R.^{(\mathcal{P})}$ conditionally on sex are identical
vs $\mathcal{H}'_1$ : the distributions of $W.R.^{(\mathcal{I})}$ and $W.R.^{(\mathcal{P})}$ conditionally on sex are different

is to consider our testing procedure.

Let $p_1$ and $p_2$ (resp. $q_1$ and $q_2$) denote the density functions of the random
variables $W.R.^{(\mathcal{I})}|man$ and $W.R.^{(\mathcal{I})}|woman$ (resp. $W.R.^{(\mathcal{P})}|man$ and $W.R.^{(\mathcal{P})}|woman$).

The testing problem $(\mathcal{T}_2)$ can be written as follows:

$$\mathcal{H}'_0 : p_1 = q_1 \text{ and } p_2 = q_2 \quad \text{vs} \quad \mathcal{H}'_1 : p_1 \ne q_1 \text{ or } p_2 \ne q_2.$$

Observations of the working-ratio random variables $Y_1, \dots, Y_n$ (resp. $Z_1, \dots, Z_n$)
in population $\mathcal{I}$ (resp. in $\mathcal{P}$) are available. The mixture model we get is the one
described in Section 1.3 with:

• $M = 2$ and $n = 5\,000$,

• $(\omega_1(i), \omega_2(i)) = (0.5899, 0.4101)$ for an $n/2$-tuple of indices,

• $(\omega_1(i), \omega_2(i)) = (0.8108, 0.1892)$ for an $n/2$-tuple of indices,

• $(\sigma_1(i), \sigma_2(i)) = (0.6796, 0.3204)$ for an $n/2$-tuple of indices,

• $(\sigma_1(i), \sigma_2(i)) = (0.8672, 0.1328)$ for an $n/2$-tuple of indices.

Let us describe the methodology of the testing procedure applied to these real
data. We use the test studied in Section 2 with regularity parameter $s = 4$ and
choose the usual Haar wavelet to construct our test statistic $T_{j_n}$. The threshold
value of the testing procedure is computed according to the following heuristic:
$t = s\,t_\alpha$, where $t_\alpha$ is the $1 - \alpha$ Gaussian quantile and, with a slight abuse of
notation, $s$ is here the standard deviation of the test statistic estimated by bootstrap
(resampling is made 200 times). As we choose $\alpha = 10\%$, we have $t_{0.1} = 1.28$.
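The threshold heuristic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not specify the resampling scheme, so the sketch resamples each sample independently with replacement, and `toy_statistic` is a hypothetical placeholder rather than the wavelet statistic $T_{j_n}$. The names `bootstrap_threshold` and `toy_statistic` are ours.

```python
import random
import statistics
from statistics import NormalDist


def bootstrap_threshold(y, z, statistic, alpha=0.10, n_boot=200, seed=0):
    """Return t = s * t_alpha, where s is a bootstrap estimate of the
    standard deviation of statistic(y, z) and t_alpha is the (1 - alpha)
    standard Gaussian quantile.

    Assumption: each sample is resampled independently with replacement.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(n_boot):
        y_star = rng.choices(y, k=len(y))  # resample Y with replacement
        z_star = rng.choices(z, k=len(z))  # resample Z with replacement
        values.append(statistic(y_star, z_star))
    s = statistics.stdev(values)
    t_alpha = NormalDist().inv_cdf(1.0 - alpha)  # close to 1.28 for alpha = 0.10
    return s * t_alpha


def toy_statistic(y, z):
    # Hypothetical placeholder: squared difference of sample means.
    # It is NOT the statistic T_{j_n} of the paper.
    return (statistics.fmean(y) - statistics.fmean(z)) ** 2


if __name__ == "__main__":
    rng = random.Random(1)
    y = [rng.random() for _ in range(100)]
    z = [rng.random() for _ in range(100)]
    print(bootstrap_threshold(y, z, toy_statistic))
```

Any statistic taking the two samples as arguments can be passed in place of `toy_statistic`; the seed is fixed only to make the sketch reproducible.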

The value of $T_{j_n}$ obtained is $t_{j_n} = 0.5412$ whereas the threshold value is $t = 0.3324$.
Since $t_{j_n}$ is larger than the threshold value $t$, we conclude that there exists
a difference between the distributions of $W.R.^{(\mathcal{I})}$ and $W.R.^{(\mathcal{P})}$ conditionally
on sex. In other words, $H'_0$ is rejected.

In this last paragraph, we study the numerical performance of our testing procedure,
built from $T_{j_n}$. For several values of n, a sample of size n is drawn from
$\mathcal{I}$ (resp. $\mathcal{P}$) and is divided into two subsamples: one subsample of size n/2 is
drawn from the subpopulation CAd and the other is drawn from the subpopulation
Tech.



For each value of n, 200 samples are drawn. The results are gathered in the 
following table: 



Sample size n    First type error $E^{\mathcal{I}}$    First type error $E^{\mathcal{P}}$    Power
1 000                                                                                        0.110
2 000                                                  0.005                                 0.185
3 000            0.005                                 0.005                                 0.335
4 000                                                  0.005                                 0.530
5 000            0.005                                                                       0.635
6 000            0.005                                                                       0.745
8 000            0.005                                 0.005                                 0.925

Table 10: First type error and power of the method



• First type error $E^{\mathcal{I}}$ is the proportion of observations of $T_{j_n}$ larger than the
threshold value, when comparing two samples of size n in $\mathcal{I}$.

• First type error $E^{\mathcal{P}}$ is the proportion of observations of $T_{j_n}$ larger than the
threshold value, when comparing two samples of size n in $\mathcal{P}$.

• Power is the proportion of observations of $T_{j_n}$ larger than the threshold value,
when comparing a sample of size n in $\mathcal{I}$ and a sample of size n in $\mathcal{P}$.
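The three quantities defined above are all empirical rejection rates. A minimal sketch follows, with a hypothetical helper name `rejection_rate` and made-up statistic values. With 200 replications, an entry of 0.005 corresponds to a single rejection (1/200) and a power of 0.925 to 185 rejections.

```python
def rejection_rate(observed_statistics, threshold):
    """Proportion of observed statistics strictly larger than the threshold."""
    if not observed_statistics:
        raise ValueError("at least one replication is required")
    rejections = sum(1 for value in observed_statistics if value > threshold)
    return rejections / len(observed_statistics)


if __name__ == "__main__":
    # Made-up values: one replication out of 200 exceeds the threshold.
    values = [0.40] + [0.10] * 199
    print(rejection_rate(values, threshold=0.3324))
```

The same function gives a first type error when both samples come from the same population and a power when they come from different populations.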

It appears that the testing procedure with the heuristically chosen threshold is
very conservative: the observed first type errors are far below the nominal level
$\alpha = 10\%$. This is the only drawback of our methodology. Nevertheless,
the behaviour of the testing procedure is as expected: the larger the sample
size, the larger the power. As we see, for the cases $n \ge 5\,000$, our testing procedure
is powerful. This tends to confirm that there exists a difference between the
working-ratios of the two populations conditionally on sex.

This study on DADS 2007 demonstrates the usefulness of the varying mixing
weights model. It suggests that our testing procedure can be successfully
applied to all types of data in social science. From our point of view, researchers
in social science should consider the varying mixing weights model and our
testing procedure as soon as some information at the individual level has been
omitted during a survey and is available at higher levels.

4 Open questions 

As a conclusion, we have provided a statistical procedure for a testing problem
on the mixture components of two populations (Y, Z). This procedure was proved
to be optimal in the minimax sense (Theorems 2.1 and 2.2). In addition, we
explained how the weights of the mixture model influence the performance of
the statistical rule. All these theoretical results are illustrated by our numerical
experiments.

It seems important to us to give some hints about possible extensions of this
work. From the theoretical and practical points of view, it would be interesting
to study the same problem without assuming that the mixing weights are exactly
known to the statistician. Several situations lead to such a setting:

• the statistician can estimate the mixing weights for an observation by 
using covariates and an appropriate predictive model such as the logistic 
one, 

• a Bayesian approach is chosen for the mixing weights, 

• exogenous information allows the statistician to roughly estimate the mix- 
ing weights. 

In this case, several natural questions arise:

• What statistical rule should be considered? 

• What kind of performance can be expected for such a rule? 

• How much do random mixing weights deteriorate the performance? 

Such questions are beyond the scope of this article and their answers certainly
involve random matrix theory.

Finally, it would be nice to show how to choose an adequate value of $t_n$ in a
simpler way than the complicated one given in Theorem 2.2.

5 Proofs of main results 

This section is devoted to the proofs of our results. The proofs often need
technical lemmas which are proved in the Appendix. For the sake of simplicity,
we sometimes omit $\vec{p}$ and $\vec{q}$ in the indices when there is no ambiguity.

5.1 Proofs of Propositions and Corollaries

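Proof of Proposition 1.1. We refer to Maiboroda [TO]. A solution of the two
problems is given for any $(l, i) \in \{1, \ldots, M\} \times \{1, \ldots, n\}$ by

$$a_l(i) = \frac{1}{\det \Gamma_n} \sum_{u=1}^{M} (-1)^{l+u}\, \gamma_{lu}\, \omega_u(i), \qquad b_l(i) = \frac{1}{\det \Gamma'_n} \sum_{u=1}^{M} (-1)^{l+u}\, \gamma'_{lu}\, \sigma_u(i),$$

where $\gamma_{lu}$ and $\gamma'_{lu}$ are respectively the minor $(l, u)$ of the matrix $\Gamma_n$ and the minor
$(l, u)$ of the matrix $\Gamma'_n$. Inequalities (3) and (4) are obtained by using Lemma 6.1.

□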
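As a numerical illustration, the following sketch computes the coefficients $a_l(i)$ for $M = 2$ from the weights $\omega$ of the DADS example and checks the property $\frac{1}{n}\sum_i a_l(i)\,\omega_u(i) = \delta_{lu}$ used in the proofs. It assumes that $\Gamma_n$ is the Gram matrix $\big(\frac{1}{n}\sum_i \omega_l(i)\omega_u(i)\big)_{l,u}$, so that $a_l = \sum_u (\Gamma_n^{-1})_{lu}\,\omega_u$; the function names are ours.

```python
def gram_2x2(weights):
    """Gram matrix G[l][u] = (1/n) * sum_i w_l(i) * w_u(i) for M = 2."""
    n = len(weights)
    return [[sum(w[l] * w[u] for w in weights) / n for u in range(2)]
            for l in range(2)]


def dual_weights(weights):
    """Return a[l][i] = sum_u (G^{-1})[l][u] * w_u(i), written out for M = 2."""
    g = gram_2x2(weights)
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    # Inverse of a 2x2 matrix through its cofactors.
    inv = [[g[1][1] / det, -g[0][1] / det],
           [-g[1][0] / det, g[0][0] / det]]
    return [[inv[l][0] * w[0] + inv[l][1] * w[1] for w in weights]
            for l in range(2)]


n = 5000
# Weights of the two subsamples drawn in the first population.
omega = [(0.5899, 0.4101)] * (n // 2) + [(0.8108, 0.1892)] * (n // 2)
a = dual_weights(omega)

# Empirical check of (1/n) * sum_i a_l(i) * omega_u(i) = delta_{lu}.
check = [[sum(a[l][i] * omega[i][u] for i in range(n)) / n for u in range(2)]
         for l in range(2)]

if __name__ == "__main__":
    for row in check:
        print(row)
```

The same computation with the weights $\sigma$ gives the coefficients $b_l(i)$.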

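Proof of Proposition 2.1. Let us evaluate the expectation of $T_j$. Let $f_i = \sum_{u=1}^{M} \omega_u(i) p_u$ and $g_i = \sum_{u=1}^{M} \sigma_u(i) q_u$ denote the densities of $Y_i$ and $Z_i$.

$$\begin{aligned}
E(T_j) &= \frac{1}{n^2}\, E\Big(\sum_{l=1}^{M}\sum_{k}\sum_{i_1 \neq i_2} \big(a_l(i_1)\phi_{jk}(Y_{i_1}) - b_l(i_1)\phi_{jk}(Z_{i_1})\big)\big(a_l(i_2)\phi_{jk}(Y_{i_2}) - b_l(i_2)\phi_{jk}(Z_{i_2})\big)\Big) \\
&= \frac{1}{n^2} \sum_{l=1}^{M}\sum_{k}\sum_{i_1 \neq i_2} E\big[a_l(i_1)\phi_{jk}(Y_{i_1}) - b_l(i_1)\phi_{jk}(Z_{i_1})\big]\, E\big[a_l(i_2)\phi_{jk}(Y_{i_2}) - b_l(i_2)\phi_{jk}(Z_{i_2})\big],
\end{aligned}$$

since the random variables $(Y_{i_1}, Z_{i_1})$ and $(Y_{i_2}, Z_{i_2})$ are independent.
We have for all $1 \le i \le n$,

$$E\big[a_l(i)\phi_{jk}(Y_i) - b_l(i)\phi_{jk}(Z_i)\big] = \int \sum_{u=1}^{M} \big(a_l(i)\omega_u(i)p_u - b_l(i)\sigma_u(i)q_u\big)\, \phi_{jk}.$$

By introducing the diagonal term $i_1 = i_2$ in the sum, we get

$$\begin{aligned}
E(T_j) &= \frac{1}{n^2} \sum_{l=1}^{M}\sum_{k} \Big(\int \phi_{jk} \Big(\sum_{i=1}^{n}\sum_{u=1}^{M} a_l(i)\omega_u(i)p_u - \sum_{i=1}^{n}\sum_{u=1}^{M} b_l(i)\sigma_u(i)q_u\Big)\Big)^2 \\
&\quad - \frac{1}{n^2} \sum_{l=1}^{M}\sum_{k}\sum_{i=1}^{n} \Big(\int \big(a_l(i)f_i - b_l(i)g_i\big)\phi_{jk}\Big)^2 \\
&= \sum_{l=1}^{M}\sum_{k} \Big(\int (p_l - q_l)\phi_{jk}\Big)^2 - \frac{1}{n^2} \sum_{l=1}^{M}\sum_{k}\sum_{i=1}^{n} \Big(\int \big(a_l(i)f_i - b_l(i)g_i\big)\phi_{jk}\Big)^2,
\end{aligned}$$

because of the two properties $\frac{1}{n}\sum_{i=1}^{n} a_l(i)\omega_u(i) = \delta_{lu}$ and $\frac{1}{n}\sum_{i=1}^{n} b_l(i)\sigma_u(i) = \delta_{lu}$.

Thus the result for the expectation is proved. □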
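The step "introducing the diagonal term" relies on the identity $\sum_{i_1 \neq i_2} c_{i_1} c_{i_2} = (\sum_i c_i)^2 - \sum_i c_i^2$, which also gives a fast way to evaluate a statistic of this form. The sketch below compares the naive double sum with this shortcut. It is only an illustration under assumptions: the normalisation $1/n^2$, the Haar scaling function on $[0,1[$ and the random coefficients `a` and `b` are ours and do not reproduce the exact construction of Section 2.

```python
import random


def haar_phi(j, k, x):
    """Haar scaling function phi_{jk}(x) = 2^{j/2} * 1{k <= 2^j x < k + 1}."""
    scaled = (2 ** j) * x
    return 2.0 ** (j / 2.0) if k <= scaled < k + 1 else 0.0


def coefficients(y, z, a_l, b_l, j, k):
    """c_i = a_l(i) * phi_{jk}(Y_i) - b_l(i) * phi_{jk}(Z_i)."""
    return [a_l[i] * haar_phi(j, k, y[i]) - b_l[i] * haar_phi(j, k, z[i])
            for i in range(len(y))]


def t_stat_naive(y, z, a, b, j):
    """Double sum over i1 != i2, quadratic cost in n."""
    n = len(y)
    total = 0.0
    for l in range(len(a)):
        for k in range(2 ** j):
            c = coefficients(y, z, a[l], b[l], j, k)
            total += sum(c[i1] * c[i2]
                         for i1 in range(n) for i2 in range(n) if i1 != i2)
    return total / n ** 2


def t_stat_fast(y, z, a, b, j):
    """Same quantity through (sum c)^2 - sum c^2, linear cost in n."""
    n = len(y)
    total = 0.0
    for l in range(len(a)):
        for k in range(2 ** j):
            c = coefficients(y, z, a[l], b[l], j, k)
            total += sum(c) ** 2 - sum(v * v for v in c)
    return total / n ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    n, M, j = 30, 2, 2
    y = [rng.random() for _ in range(n)]
    z = [rng.random() for _ in range(n)]
    a = [[rng.uniform(-2, 2) for _ in range(n)] for _ in range(M)]
    b = [[rng.uniform(-2, 2) for _ in range(n)] for _ in range(M)]
    print(t_stat_naive(y, z, a, b, j), t_stat_fast(y, z, a, b, j))
```

Both functions return the same value up to rounding; only the second is practical for sample sizes such as $n = 5\,000$.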
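Proof of Corollary 2.1

According to Proposition 2.1, we only have to bound the quantity

$$\frac{1}{n^2} \sum_{l=1}^{M}\sum_{k}\sum_{i=1}^{n} \Big(\int \big(a_l(i)f_i - b_l(i)g_i\big)\phi_{jk}\Big)^2.$$

Using the Cauchy-Schwarz inequality and Lemma 6.3, we have

$$\begin{aligned}
\frac{1}{n^2} \sum_{l=1}^{M}\sum_{k}\sum_{i=1}^{n} \Big(\int \big(a_l(i)f_i - b_l(i)g_i\big)\phi_{jk}\Big)^2
&\le \frac{1}{n^2} \sum_{l=1}^{M}\sum_{i=1}^{n} \sum_{k} \int_{I_{jk}} \big(a_l(i)f_i - b_l(i)g_i\big)^2 \\
&\le \frac{2}{n^2} \sum_{i=1}^{n}\sum_{l=1}^{M} \Big[\sum_{k} \int_{I_{jk}} \big(a_l^2(i)f_i^2 + b_l^2(i)g_i^2\big)\Big] \\
&\le \frac{4L}{n^2} \Big(\sum_{i=1}^{n}\sum_{l=1}^{M} a_l^2(i)\|f_i\|_2^2 + \sum_{i=1}^{n}\sum_{l=1}^{M} b_l^2(i)\|g_i\|_2^2\Big) \\
&\le \frac{8LMR^2}{n\kappa_n}.
\end{aligned}$$

The last inequality is due to Proposition 1.1 and the fact that for all $1 \le i \le n$ the
density functions $f_i$ and $g_i$ belong to $L_2(R)$. □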

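Proof of Proposition. Let us consider the variance of $T_j$. For all $(i_1, i_2)$, let
$h_j(i_1, i_2)$ denote the quantity

$$h_j(i_1, i_2) = \sum_{k}\sum_{l=1}^{M} \big(a_l(i_1)\phi_{jk}(Y_{i_1}) - b_l(i_1)\phi_{jk}(Z_{i_1})\big)\big(a_l(i_2)\phi_{jk}(Y_{i_2}) - b_l(i_2)\phi_{jk}(Z_{i_2})\big).$$

The variance of $T_j$ satisfies

$$\begin{aligned}
\mathrm{Var}(T_j) &= \frac{1}{n^4} \sum_{i_1 \neq i_2}\sum_{i_3 \neq i_4} \mathrm{Cov}\big(h_j(i_1, i_2), h_j(i_3, i_4)\big) \\
&= \frac{1}{n^4} \sum \mathrm{Var}\big(h_j(i_1, i_2)\big) + \frac{1}{n^4} \sum \mathrm{Cov}\big(h_j(i_1, i_2), h_j(i_2, i_1)\big) \\
&\quad + \frac{1}{n^4} \sum \mathrm{Cov}\big(h_j(i_1, i_2), h_j(i_1, i_3)\big) + \frac{1}{n^4} \sum \mathrm{Cov}\big(h_j(i_1, i_2), h_j(i_2, i_3)\big) \\
&\quad + \frac{1}{n^4} \sum \mathrm{Cov}\big(h_j(i_1, i_2), h_j(i_3, i_1)\big) + \frac{1}{n^4} \sum \mathrm{Cov}\big(h_j(i_1, i_2), h_j(i_3, i_2)\big) \\
&\quad + \frac{1}{n^4} \sum \mathrm{Cov}\big(h_j(i_1, i_2), h_j(i_3, i_4)\big) \\
&= A_1 + A_2 + A_3 + A_4 + A_5 + A_6 + A_7,
\end{aligned}$$

where each sum runs over pairwise distinct indices.

Using independence arguments,

$$A_7 = \frac{1}{n^4} \sum \mathrm{Cov}\big(h_j(i_1, i_2), h_j(i_3, i_4)\big) = 0.$$

We still have to bound the quantities $A_i$ ($1 \le i \le 6$). Since the ways to
bound $A_1$ and $A_2$ (resp. $A_3$, $A_4$, $A_5$ and $A_6$) are similar, we will only bound $A_1$
and $A_3$. Such bounds are given in Lemmas 6.7 and 6.8. The proof of the proposition
is a direct consequence of Lemmas 6.7 and 6.8 by taking $C_T = 2C_{T_1} \vee 4C_{T_3}$. □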



5.2 Proofs of Theorems 

Proof of Theorem 2.1

Let us fix $0 < \gamma < 1$ and $s > 0$. Under the null hypothesis, we directly use the
well-known Bienaymé-Chebyshev inequality.



< 



< 



< 



\ '^^ Kn I 

M2 2^" 

„2 ^2 (i_ 8LMiIi)%4- 
The last inequality is obtained using Remark 2.2. According to the choices of
the level $j_n$ and the threshold $t_n$, we have

M2 2^" 2C^ Af2 

< 



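Then

$$P_{\vec{p},\vec{q}}(\Delta = 1) \le \frac{\gamma}{2}.$$

Under the alternative, we use the expectation of the test statistic and some
approximation argument. The second type error is

$$P_{\vec{p},\vec{q}}(\Delta = 0) = P_{\vec{p},\vec{q}}\big(-T_{j_n} + E_{\vec{p},\vec{q}}(T_{j_n}) \ge -t_n + E_{\vec{p},\vec{q}}(T_{j_n})\big).$$

The wavelet expansion in the Besov body $B^s_{2,\infty}$ leads to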

M AI ^ I, X 2 



^ M " f f Y 

~EEE / (M^)f^-^li^)9^)(|>3^k) 



M 



> 5:ih-gHI^Mi?2---5^-i„. 
1 

for any n large enough. 

As a consequence, applying the Bienaymé-Chebyshev inequality leads to

\\pi - lih 



< 



V I I 

/ M 

n'K^ [-Y,\\pi-qi\\l-MR2 

\ 1=1 



The choice of j„ and the fact that the functions are in the alternative entail the 
following upper bound 



M \ 2 



rl 



1=1 
According to the choices of $j_n$ and $r_n$, one gets for n large enough:

\\pi - <lih 



< 



V I I 

3Ct-, 



'1 _ -R t \^ K'^r* ' 



For C large enough, we finally obtain

$$P_{\vec{p},\vec{q}}(\Delta = 0) \le \frac{\gamma}{2}.$$

The results on the first-type and second-type errors show that, for C large enough, the
sum of the errors is less than $\gamma$. Therefore the upper bound is proved. □



Proof of Theorem 2.2

Let $\gamma \in\, ]0, 1[$, $C > 0$ and $C_1 > 0$. We define

ei(i?,C,Ci,n,s) = [{f,t)-'^ue{l,...,M},p^~q^eBl^{R), 

3u e {!,..., M},{p^,qu) e A„(i?,C,Ci)}, 

where A„ {R, C, Ci ) is defined in ([B]) . It is well-known that 



inf sup P^^(A = 1)+ sup P^^(A = 0) 



> inf sup P^,^(A = l)-h sup P^,^(A = 0) 

\{^,~^)eeo{R) ' (^,-^)eei(R,c,Ci,n,s) 

> l-^l|lP^,lt-lP-||, 

where $\|\cdot\|$ is the $L_1$-distance and $\pi$ is an a priori probability measure on the
set $\Lambda_n(R, C)$. First we define the probability measure $\pi$ and its support. Let
$\theta = (\theta_1, \ldots, \theta_M)$ denote an eigenvector associated with the smallest eigenvalue
of $\Sigma\Sigma^*$ (which is $\kappa_n$ according to HYP-1) such that $\|\theta\|_2 = 1$.

Recall that here $j_n$ is the same as the one defined in Theorem 2.1. Let $\mathcal{T}$ be the
subset of $\mathbb{Z}$ containing every integer $k$ satisfying the following properties

• fcer=^ [w' w[c [0,l[; 

• (A:, fc') e r X r with fc ^ A:' =^ [^,.|±^[n [%^,*^ 
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The cardinal of T is clearly equal to T ~ [2^^^ — J and we denote its elements 



fci, . . . , fcr- The following parametric family of functions is considered 



qi,dz) = pi{z) + r+^Cy/ML 0i ^ ^2" 



where $\zeta_k = +1$ or $-1$.

Remark that $\zeta_k$ does not depend on the index $l$. Therefore the density of $Z_i$ is



M 



M 



1=1 



1=1 



The probability measure $\pi$ is such that the $\zeta_k$'s are independent Rademacher
random variables with parameter $\frac{1}{2}$.

The function qi^(^ is a density. Indeed, for n large, qi,,^ is non-negative. Moreover, 
as V'jnfc is a wavelet, we have /V-'j„fc = and therefore /(?;.<; = 1. If C < 
a/-R/M22«+2, then g;,^ - pi belongs to the ball of the Besov body B^^^{R). 
There exists I such that 



M0f > 1 and lb, - g/,cll2 = TLMC'^2'^+^'-^^"'-^-ef > 



Therefore the probability measure $\pi$ is solely concentrated on the alternative.
It is well-known that the $L_1$ distance can be bounded by the $L_2$ distance. We
have



< 




\ 



i\ 9i{Zi) ^ 



1. 



(8) 



Therefore it suffices to evaluate the second-order moment of the likelihood ratio: 

\ \ 2" 



n 



\ 9i{Zi) J J 

n / n ( i+2^+'c\/MZa 2- 



KkeT" i=l 

Let us introduce the following random variables 



M 



M 



9i{Zi) 
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We have 



n / n 1 + ^'^'cvml Ck 2- 



M 



9i{Zi) ^ 



^ / n n n \ 

= X{l[Y{{^ + '^Zik + Zl) + Y{{l-2Zik + Zl)+2^{l-Zl)\ 

^ n n 

= %,^[ni{2n(i+^'fc)+2n(i-^«^) 



feer i=i 

+ Zikhi{Zik, . . . , Zi-i^h, Zi^i^k, . . . , Znk)} 



i=l 



fcer \j=i i=i / 

T n 

+ ZikJl^Zik^ , • • • , Zn^k^_^ , ^ife^, . . . , Zi-i^k^, Zi^i^k^, . . . , ZnkrJ Zi^kr+iJ • • • ) -^n/cT 



where the functions /i, and /li are sums of products of their arguments. As 
E^^(Zife) = and ZikZik' = for fc ^ k', the last term vanishes. Thus we are 
only interested in the first term. 
Define for all k gT: 

^li^) = X/ ^iik^i2k ■ ■ ■ ^iik' 

l<ii<i2<---<ii<n 

hoik) = 2. 



Then, we have 



/ n n Y 

Ha 11(1 + + 11(1-^''=) 

.fcer \i=i i=i / 



^) n E ^'(^) 

M even 



;i,...,Zt even 
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T T 



h,...AT=0 

Ii,...,It even 



< 



< 



keT 



J2 



( /=0 

\i even 



E E ^ 

fceT \ /=0 l<ii<...<i;<r! 

\i even 



,.E 



ker \t=i 



z: 



ik 



''ik 



fcer \i=i / 



Each E 



is bounded as follows, 



Therefore this bound entails 



«pf^EfEE^,^(^-)Vl < exp f 1 ^ 2-+V^2-- 
\ keT \i=i / / y feer 

\ fesr 



a-2j„ 



at 



' n M ' 

E E ^l^"i^li^)'^rnii) 
L i — 1 Z,m — 1 J 



exp ( J2 2^^+''C^2-^'-^-^'"^^{Knf 



< exp 1 2*"+';^';^' 



CI 



Inequalities ^ and ([9]) lead to 



\\ft,t - P-ll < y exp (^24^+2 Af2i^2 _ 1. 

The choice of any constant C such that C < entails that the left-hand side 
of ^ is strictly smaller than 2(1 — 7). 
□



6 Appendix 

This section contains the technical lemmas used in the proofs of the main results. 



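Lemma 6.1

$$\frac{1}{n} \sum_{l=1}^{M}\sum_{i=1}^{n} a_l^2(i) \le \frac{M}{\kappa_n}, \qquad (10)$$

$$\frac{1}{n} \sum_{l=1}^{M}\sum_{i=1}^{n} b_l^2(i) \le \frac{M}{\kappa_n}. \qquad (11)$$

Proof of Lemma 6.1

The proofs of (10) and (11) are identical, which is why we only prove (10). Let
$\lambda_{\min}(\Gamma_n)$ be the smallest non-zero eigenvalue of the matrix $\Gamma_n$. Let
$A = (A_{j,l})_{1 \le j \le n,\, 1 \le l \le M}$ denote the $(n \times M)$ matrix with coefficients $A_{j,l} = a_l(j)$. Since
the matrix $AA^*$ has at most M non-zero eigenvalues, we have

$$\sum_{l=1}^{M}\sum_{i=1}^{n} a_l^2(i) = \mathrm{trace}(AA^*) \le M\, \lambda_{\max}(AA^*). \qquad (12)$$

Clearly, the following implication holds:

$\lambda$ is a non-zero eigenvalue of $AA^*$ $\Longrightarrow$ $n\lambda^{-1}$ is an eigenvalue of $\Gamma_n$.

So

$$\lambda_{\max}(AA^*) \le \frac{n}{\lambda_{\min}(\Gamma_n)}. \qquad (13)$$

Lemma 6.1 is proved by inequalities (12) and (13) under HYP-1. □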
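The trace argument can be checked numerically for $M = 2$. The sketch below reuses the weights $\omega$ of the DADS example and assumes, as in the proof, that $\Gamma_n$ is the Gram matrix of the weights and that $a_l = \sum_u (\Gamma_n^{-1})_{lu}\,\omega_u$; it takes $\kappa_n = \lambda_{\min}(\Gamma_n)$ for the comparison, which is our choice for this illustration. In this setting $\sum_l \sum_i a_l^2(i) = n\,\mathrm{trace}(\Gamma_n^{-1})$, which is at most $Mn/\lambda_{\min}(\Gamma_n)$.

```python
import math


def gram_2x2(weights):
    """Gram matrix G[l][u] = (1/n) * sum_i w_l(i) * w_u(i) for M = 2."""
    n = len(weights)
    return [[sum(w[l] * w[u] for w in weights) / n for u in range(2)]
            for l in range(2)]


n = 5000
omega = [(0.5899, 0.4101)] * (n // 2) + [(0.8108, 0.1892)] * (n // 2)

g = gram_2x2(omega)
det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
trace = g[0][0] + g[1][1]

# Coefficients a_l(i) obtained from the inverse Gram matrix (2x2 cofactors).
inv = [[g[1][1] / det, -g[0][1] / det],
       [-g[1][0] / det, g[0][0] / det]]
a = [[inv[l][0] * w[0] + inv[l][1] * w[1] for w in omega] for l in range(2)]

# Left-hand side of the bound: sum over l and i of a_l(i)^2.
sum_of_squares = sum(v * v for row in a for v in row)

# Smallest eigenvalue of the symmetric 2x2 matrix G.
lambda_min = (trace - math.sqrt(trace ** 2 - 4.0 * det)) / 2.0
bound = 2 * n / lambda_min  # M * n / lambda_min with M = 2

if __name__ == "__main__":
    print(sum_of_squares, n * trace / det, bound)
```

The first two printed values coincide up to rounding, since the trace of the inverse of a 2x2 matrix equals its trace divided by its determinant.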



Lemma 6.2 For all $(j, k) \in \mathbb{Z} \times \mathbb{Z}$, let us put

$$I_{jk} = \left[\frac{k - L}{2^j}, \frac{k + L}{2^j}\right[.$$

Then for any fixed $(j, k)$,

$$\mathrm{Card}\{k' \in \mathbb{Z} : I_{jk} \cap I_{jk'} \neq \emptyset\} \le 4L.$$

Proof of Lemma 6.2
Clearly, $I_{jk} \cap I_{jk'} = \emptyset \iff k' - L \ge k + L$ or $k' + L \le k - L$.
Hence, $I_{jk} \cap I_{jk'} \neq \emptyset \iff k - 2L < k' < k + 2L$.
As a consequence, we have

$$\mathrm{Card}\{k' \in \mathbb{Z} : I_{jk} \cap I_{jk'} \neq \emptyset\} \le 4L.$$

□



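Lemma 6.3 For any function $h \in L_1(\mathbb{R})$ and any $j \in \mathbb{N}$,

$$\sum_{k \in \mathbb{Z}} \int_{I_{jk}} |h(x)|\,dx \le 2L\|h\|_1.$$

Proof of Lemma 6.3 Let us define for any $h \in L_1(\mathbb{R})$:

$$P_{jk}(h) = \int_{I_{jk}} |h(x)|\,dx, \quad \forall j \in \mathbb{N}, \ \forall k \in \mathbb{Z}.$$

Judging from the definition of the intervals $I_{jk}$, we easily prove that for any $j$, since for a fixed $u$ the intervals $I_{j,\,2Lm+u}$ ($m \in \mathbb{Z}$) are disjoint,

$$\sum_{k \in \mathbb{Z}} P_{jk}(h) = \sum_{u=1}^{2L} \sum_{m \in \mathbb{Z}} P_{j,\,2Lm+u}(h) \le \sum_{u=1}^{2L} \|h\|_1 = 2L\|h\|_1.$$

□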



Lemma 6.4 Let W be either Y or Z. For any $1 \le i \le n$ and any $(j, k)$, we
have



|E(0,fe(M^,))l < (^2L sM\\pi\\^y\\qi\U ) 2 
Proof of Lemma \6.4\ 

Using the Cauchy-Schwarz inequality, we obtain 



\E{<l>,k{W.))\ < 



<t>jk fi 



(t^jk 9i 



< I \4>jk\ sup||p,||^V J \(l}jk\ sup|lg/|l^ 



< 2L sup(||p,|looV||9i|loo) 



□ 
Lemma 6.5 Let W be either Y or Z and c be either a or b. For any $1 \le i \le n$
and any $(j, k)$, the following inequalities hold



sup 

I 



< 4L|10|U2^ 



sup|Q(i)| < Jn^ici, 



Proof of Lemma \6.5\ 

Since the wavelets are compactly supported, for any fixed k the sum over k' has
at most 4L non-zero terms (see Lemma 6.2). So, the Cauchy-Schwarz
inequality entails that

< (suplbill^^ J |(/>ifc| l^jfe'l ] V ( sup||g;|looI] J \<Pjk\\(t>jk' 



We also have 



sup 
I 



< AL sup(||pz|LV||qz|L)- 
I 



X! / <^jfc(^'' ~ 9;) < 22||0||^sup^ / \pi~qi\ 

< 2l(^Jpi + J qi^ \\(b\\oo 25 
= 4i||0|U2i 



Clearly, for any 1 < i < n, 



sup \ci{i)\ < sup ^Y^cf{i) 



□ 
Lemma 6.6 Let $p_l$, $q_l$, $p_{l'}$ and $q_{l'}$ be four probability densities in $L_2$. Then, for
any $j \in \mathbb{N}$



X! (^J ^ikPi - j 4>]kqi^ < '2L\\pi - 



( / (l^jkPi - / (t>]kqi){ / (f>jk'Pi' - / (f>jk'qi') 



E E 

k k'-.IjkrMjy^H 



Proof of Lemma [ 
Using the Cauchy-Schwarz inequality, we have 



< AL^{\\pi-qi\\l + \\pv -qvWl 



< 2L\\pi-qi\\l 



Lemma 16.31 entails that 



E E 



( / <P]kPi - / (f>jkqi){ / (pjk'Pi' - / (pjk'qi') 



< 



E E ( / '^j^p' - / "^3^1') + E E 

^LY^hll {Pi-qif+^LY^Ull [Pv-qv? 



1 

< - 
- 2 

< ^(81^'l|Pi-'Zni2 + 8i'l|Pi'-9Hl2) 



< 4.Lm\pi - qiWl + \\pi, - qi 



'II2 



0jfc-Pi' - / 0jfcgi 



□ 



Lemma 6.7 There exists a constant — Crp{R,L, H^Hoo) > such that 



Proof of Lemma \K71\ 

Let us evaluate each variance 



Var_j _^ (/ij (11,12)) = Cov {hj{ii,i2),hj(ii,i2)) ■ 
We expand the covariance

Cov{ iai{ii)(l>jk{Yi^) - bi{ii)(t)jkiZiJ) (a;(i2)?!'jfe(li2) - h{i2)<i)jk{Zi^)) , 
{ai'{ii)(pjk'{YiJ - hi>{ii)(t)jh'{ZiJ) {av{i2)4>jk'{Yi^) - h'{i2)4>jk'{Zi^))) 

= Cov {ai {ii)(j)jk {Yi,)ai {i2)(f)jk (^is ). («i)<^ife' {Yh )«;' («2)<f'jfc' (^"22)) 
■-Cov{ai{ii)(l)jk(Yi^)ai{i2)(l)jk{Yi2),ai^{ii)(f>jk' {Yi^)bi:{i2)4>jk'{Zi^)) 
-Cov (a, iii)(l)jk {Yi^ )ai{i2)4'jk {Yt^)M' {h)4>jk' {Zi^ )av {i2)<Pjk' (Yi^)) 
+Cov iai{ii)<pjk{Yt-,)ai{i2)(pjkiY^2),bi'{ii)(l)jk'iZ^Jbi'{i2)4>jk'{Zi^)) 
-Cov {ai {h)(t>jk (5^n )bi {i2)^jk {Zi^), a-v {h)4>jk' (5^n )av {i2)(t>jk' (Yi^)) 
+Cov{ai{ii)(j)jk{Yi-,)bi{i2)4'jk{Zi2),av{ii)(j)jk' (Yi^)bi^{i2)(t>jk'iZt^)) 
+Cov {ai {ii)(j)jk {Yt^ )bi {i2)(f>jk {Zi^), h' {ii)cfijk' (Zi^ )ai> {i2)(pjk' {Yi^}) 
-Cov [ai (zi)0j7c {Yti )bi {i2)4>jk (Zi, ) , h, {ii)(fijk' {Zr^ )bi, {i2)4']k' {Zi^)) 
-Cov {bi{ii)(pjk(Zi^)ai{i2)4';jk{Yt^),av{ii)(j)jk' {Yt^)av{i2)4'jk'{Yi^)) 
+Cov {bi {ii )(i)jk {Zi^)ai {i2)<Pjk (Y^^), ai' {ii)4>jk' iY^i)bi' ii2)4>jk' {Zi^)) 
+Cov {bi {ii )(j)jk {Zii )ai {i2)(t>jk {Yi^), k' {ii)^jk' {Zi^ )av {i2)(t>jk' {Yi^ )) 
-Cov {bi {ii )(t)jk {Zi-, )ai {12)4' jk {Yi^ ) , h' (ii)4>jk' {Zi^)bi> (12 )<^jfe' {Zi^ )) 
+Cov {bi {ii )(j)jk [Zijbi {i2)4'jk (Zi^ ) , ai' (ii)4>jk' (Yi^ )ai> {i2)4'jk' (Yi^)) 
-Cov {bi{ii)(l)jkiZi^)bi{i2)(l)jk{Zi2),ai,{ii)cf)jk'{Yi^)bi>{i2)(l)jk'iZi2)) 
-Cov {bi{ii)(i)jk{Ziy)i{i2)(pjk{Zr.2),bif{ii)(i)jk'{Zi^)ai\i^^ 
+Cov {bi{ii)(i)jk{Zi^)bi{i2)(p;jk{Zi.^),bi,{ii)(j)jk'{Zi^)bi'{i^^ . 

According to independence arguments, the following terms are clearly equal to 
zero: 

Cow {ai {ii)<pjk {Yi^ )ai {i2)(pjk {Yi^ ) , bv {ii)(fijk' {Zi^ )bi> {i2)4'jk' [Zi^]) , 

Cov (a; iii)(l)jk {Yi^)bi {i2)4'jk {Zi^ ) , {ii)(j)jk' {Zi^)av {i2)<pjk' (Yi^)) , 

Cov {bi {ii)cj)jk {Zi^ )ai {i2)4'jk {Yi^), {ii)4'jk' (Y^^ )bi' [12)4' jk' {Zi^)) , 

Cot; {bi {ii)cj)jk {Zi^ )bi ii2)(t)jk {Zi^),aif {ii)cj)jk' {Yi^ )ai' {i2)(j)jk' {Yi^)) . 

The remaining terms can be split into two types: those involving two different
random variables and those involving three different random variables. Let us
handle these two cases separately. First, we consider the case with two different
random variables. We need to bound terms such as

^ ^Cov{ai{ix)(l>jk0^ii)ai{i2)(t>jk{Yi^),ai'{H)43k'{Yi^)a^^ 
= ^^ai{ii)ai {i2)ai, {ii)ai' (12)^ {(j)jk {Yi^ )(t)jk' {YiJ) E {(j)jk {Yi^ )4>jk' {Yi^ ) ) 

ai{h)ai{i2)av{h)av (is)^ {(t>jk{Yi,)) E {<t>jk' {Yi,)) E {(t>jk{Yi^)) E {4jk'{Yi^)) ■ 

n^«2 k,k' 
As the wavelets are compactly supported, we get for any $(i_1, i_2)$,



'^ai{ii)ai{i2)ai'{ii)ai>{i2)E{(t)jk{YiJ(j)jk'{Yi^))E{(l)jk{Yi^)4>jk'{y^^ 

k,k' 

< ll/iilloo^ |a;(ii)a;(i2)a;'(ii)a;'(j2)| \<j)jk4'jk'\fi2 

k,k' •' 

< 2^+'i'll0llLsup(lhllooV|ki|l„o)l«K*i)ai(*2)ai'(*i)ar(*2)|. 

I 

The second sum is much simpler to bound. According to Lemma 6.4, it can be
bounded as follows

^ ai{ii)ai{i2)ai,{ii)ai,{i2)E {(f>Jk{Y^,)) E (<^,fe'(F^J) E {c^Jk{Y^,)) E {<j)jk' {Y^,)) 

k,k' 

< ^|a^(^l)aK*2)a^'(^l)a^'(^2)|E(|0,fc(r,J|)E(|0,fc,(K,J|)(^/2I2-5)' sup(||p,|loo V |k;|loo) 

k,k' ' 

= L 2^"^ V|a,(ii)a;(i2)ai'(«i)ar(i2)| / |'/>jfc|/»i/ \<P]k'\f^Asup{\\pi\\^V\\qi\\^) 
k,k' ■'i^^ -fh^' V I 

< 8L3|aK«i)ai(*2)ai'(«i)ar(*2)|||</'|lLsup(||pi|loo VlkilD- 

I 

Let us now focus on the sums over $i_1$, $i_2$, $l$ and $l'$:

'^\ai{ii)ai{i2)ai>{ii)ai'{i2)\ < ^ (Q;(n)^Q;'(»2)^ + ai'{ii)'^ai{i2)'^) 

/' 

< 

We see that this term behaves like $n^2$. The three other terms featuring only two
different random variables are handled in the same way.

Therefore it remains to evaluate the eight terms with three different random 
variables. For example, let us consider 

Cow (a; {ii)4)jk {Yi^ )ai {i2)4>]k {Yi^ ) , a;/ {ii)4>jk' {Yi^ )av {i2)4>jk' {Zi^)) , 

and let us omit for a moment the sums over $i_1$, $i_2$, $k$, $k'$, $l$ and $l'$. The covariance
can be expanded as

Co,; {(j),k{Y,,)(j,jk{Y,,), <Pjk' iY^, )<Pjk' (Z,,)) = E {cj^^k' {Z,J) E (0^^ (F,, )) Cov (cjjjk {Y^,), (j),k' {Y,,)) . 

When we add the sums over $k$ and $k'$, the second term is handled exactly as
the second term above in the case of two different random variables. Thus,
it remains to consider the first summand. As above, the compactness of the
wavelet support entails that



J2 E i'l^jk' {Z,,)) E [ct^.k {Y^,)) Cov {ct>jk {Y^, ) , C^,k' (i^n )) 

fe,fe' 

< ^ |E (Z, J) E {c^.k {Y^,)) E (0,fe (r, J) E (r, J) I 

k,k' 

+ ^ |E (Z, J) E {<t>,u (i; J) E (y, J,/),fc, (y,, )) I 
fe.fe' 

According to Lemmas 6.4 and 6.5, we have

An - ^|E(</.,,.(Z,J)E(0,fe(y,J)E(0,fe(r,J)E(0jfc,(r,J)| 
fc.fc' 

< (^2i-^Lsup(||pj||^ V hiwS) J2 J \^i^\9^^ E / I'l'^k'lU. 

and 

A12 = ^|E(0,,,(z,,))E(0,fe(yj)E((/.,,(yj^,fe,(yj)| 



k,k' 

< 



2i-^Lsup(|b,|U V Ik, ID E |iE('/',/cmj</.,,,(y.j) I 



fe.fe' 



< 4L(^2i-^Lsup(||p,|looV|kHloo))2*||0||oo^ J \^,k\h 

< 8L' (^2Lsup(||p,||^V||q,||^)^ 11011^1 h 

< l6L'mlsnpi\\pi\\^V\\qi\\^). 

I 

It remains to sum over $i_1$ and $i_2$, as the sums over $l$ and $l'$ are not important
(they only change the constant). We have

'^\ai{ii)ai{i2)ai>{ii)bi>{i2)\ < ^^\ai{ii)ai{i2)ai>{ii)bi>{i2)\ 



^ \Y1 (e«'(^i)'^"(*^)' + E«"(^i)'«'(*^)'J 



< . 
Clearly, this term behaves like $n^2$. The other covariances involving three random
variables are handled in exactly the same way.

By combining all the previous bounds, we conclude that

Ai < (fci + fc2 2^)-^, with ki = 224RL^m,, k2 = 32RL''U\\1^. 
As a consequence if we write = ki + k2 one gets 



□ 



Lemma 6.8 There exists a constant = Cr^{R,L, Halloo) > such that for 
any j G N 



I I 
Proof of Lemma \6.8[ 

Clearly, the term ^3 can be bounded as follows 
^3= ^ Cov {hj {ii,i2) ,hj (ii^is)) 

(ii)(/>jfc(Z,J) {ai{i2)(pjk{Yi2) - bi{i2)4>jk{Zi^)) , 

ii^i^^i^ k,k' 1,1' 

^ EE*^°" (ai(«i)<^jfc(i^n) - bi{ii)(j)jk{Zi^),ai'{ii)(j)jk'{YiJ - hv{ii)(j)jk'{Zi^)) 

i-^^i2^i-^ k,k' 1,1' 

xE(a;(i2)0jfe(>^i2) - bi{i2)'f'jk{ZiJ)E{ai{i3)(f>jk'{Yi^) - bv(i3)4>jk'{Z.i^)) 
= ^ ^^Cow(a;(ii)0jfe(>^ii) - bi{ii)(j)jk{Zi^),ai,{ii)(t)jk'{Yi^) - &i'(*i)0jfe' (^ij) 

*i k.k' 1,1' 

¥.{al{l2)^Jk{Y^,) - 6,(i2)0jfc(^»J)E(a('(i3)<^jfc'(^»3) " bv{h)4>jk' {Z^,)) 

- ^ ^^Cou(a;(ii)(/)jfc(yjJ - bi{ii)(t)jk{Z^^),av{ii)(t)jk'{Yi^) - bi'{ii)(l)jk'{ZtJ) 

11=12,13 k,k' 1,1' 

E{ai{i2)(j)jk{Yi^) - 6K«2)0jfc(^i2))IE(ai'(«3)'/'ifc'(^i3) ~ h'{h)<P]k' (Zi^)) 

- ^ ^^Cou(a;(ii)0jfe(yij - bi{ii)(l)jkiZi^),ai>{ii)<pjk'{Yi^) - 

Zl —13 ,12 k,k' l^l' 

E(a;(i2)(/>jfe(Fi2) - bi{i2)(t)jk{Zi^))¥.{ai,{i3)(j)jk'{Yi.^) ~ bi,{i3)(t)jk' (Zi^)) 

+ ^ '^'^Cov {ai{ii)(l)jk{Yi^) " &i(«i)</'jfc(-2'n),ai'(«i)0ifc'(^ii) - bv(ii)4>jk'{Zi^)) 

ix—i2—i3 k,k' 1,1' 

E{ai{i2)(f>jk{Yi^) - bi{i2)(l>jk{Zi2))E{ai'{i3)(f)jk'{Yi.J - bi-{i3)(j)jk'{Zi^)) 
= A31 — A32 — A33 + A34 
< IA31I + IA32I + IA33I + IA34I. 
We will bound each term separately.

Let us start with $|A_{31}|$. The first step is to expand the covariance.

«1, 22,^3 k^k' 1,1' 

E(ai(i2)0jfc(>^j2) - bi{i2)(t'jk{Zi^)) \ 
= I EEE'^'^" ((ai(«i)</'jfc(^ii) - bi{ii)(t>jk{Zi^)) , {ai-{ii)(t>jk'{yti) - bi>{ii)(j)jk'{Zi 

ii k,k' 1,1' 

( / - / (t>jkqi){ / (l^jk'Pi' - / (j^jk'Qi') 



ii k,k' 1,1' 

-E {ai (y, J) E {ai, (F, J) - E (6, J) E (zi)^,^, (Z, J) ] 

(/ 4>jkPi - I (l^jkqiji (t>jk'Pi' - / (t>jk'qi') 



The first two terms involve only one expectation and can be bounded in the 
same way. Therefore let us bound the quantity 

^^^E {ai{ii)(l)jk{Yt^)ai' {ii)<j)jk' {Y,J) ( J (j)jkPi - J <l>jkqi){ j (pjk'Pv ~ J (f>jk'qi') 

ii k,k' 1,1' 



Clearly E \ai{'ii)ai'{ii)\ < nJ {ai,ai),^ {ai',ai>)^ < 



M 



Since \E {(l)jk{YiJ(j)jk' (YiJ) \ < sup(||p;||oo V ||g/||oo), lemma Ull] entails that 



E E ( / ^^''P^ ~ / '^j'^qOi / (f'jk'Pi' - I (f'jk'qi') 

Then one deduces that for any 1 < ii < n 



<^Lm\pi~qi\\l + \\pi,-qi,\\l 



J2J2\^('i'MYri)<Pjk'iY.,))\ 

k,k' 1,1' 

< SL^supdhlloo V llgHloo) Ell^^'-^'ll 
' I 

Hence 



(/ (t^jkPi- / (t>jkqi){j 4>]k'Pi' - I 4>]k'qi') 



Yl^^^'^°'i'^^^)'t^ok{Yn)av{ii)(t)jk'{Yi^)){j (pjkPi- j (t>jkqi){J (pjk'Pv 

ii k.k' 1,1' 



< 



K 



(t>]k'qi') 



sup(b/||oo V llgilloo) ^\\pi~qi\\l n. 
Now we come to the last two terms which involve two expectations. Let us
consider for example the quantity



k,k' 
< 



^E{(j)jkiYi,))E{(pjk'iYi,)) {J (pjkPi- J (t>jkqi){j (jjjk'Pr - J (t)]k-qi') 

^ \&{4>Jk{Y^,))E{4>Jk'{Y^,))\ <t>jkPi - J cl)jkq?j + l^j c^jk'Pv - J <P,k'qv 



< sup 

LI' 



2i(bi||oo V 



llgHloo)^ 2-i^|E(0,,(y,j)|^(/', 



^Jk'pl' 



\\pi' - qv 

\ I J I' 

The last inequalities are obtained by using Lemma 6.6 for any $1 \le i_1 \le n$. Hence



Pjk'qv 



< 4V2L2 11011^ (sup(||p;||oo V ||(7/||oo)) sup 



^^^E(a,(zi)0,,(K,J)E(ap(zi)0,,,(r,J)(y' , 

ii k.k' l.l' 



(t)jkPi- / (f>]kqi)i (l>jk'Pi' - I (t)jk>qi') 



< 



4^/2A-/ 



Li||<^|loo (sup(|b,|looV |l<Z/|loc) 



sup Hp;/ - grlla 
I' 



K \ I 

Therefore the two last bounds entail that 



|^3i| < C31— ^^IIp; -g;|l2' wl^ere c^i = AL'' VR. [2^ + V2L\\4>\\ 
The way to bound $A_{32}$ and $A_{33}$ is trickier. We have

1^32! < I X! X! X! [ai{ii)ai' [ii)<Cov {(l)jk{Yi^), (t>jk' (Y^J) + bi{ii)bi,{ii)Cov {(j)jk{Zi^),(j)jk' {Zi^))] 

Id' «i ■i2 k,k' 

< I [ai{ii)ai> {ii)Cov {(j)jkiYiJ, (t)jk' (Y^i)) + bi{ii)bi> {ii)Cov ((/ijfc(ZjJ, 0^^/ (ZjJ)] 

21 k.k' 



[ai{ii)E{(l>jk{YiJ) - bi{ii)E{(l)jk{Zi^))] [n / (jjjk'Pv - n I (jjjk'qi 
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+ 1 E E E ^^^^ (^'1 )) ^ ('^^■fc' (^'1 )) 

(a;(ii)E(0jfe(yiJ) - 6((ii)E(0jfe(Z,J)) [n I (f>jk'Pv - n I (fyjk'Qi' 



+ 1 E E E h{^i)bu{ti)E )^,k' (Z.J) 

l.l' ii k,k' 



(a;(ii)E(0jfc(yiJ) - 6;(ii)E(0jfe(ZiJ)) j 4>jk'Pv - n j (jyjk'qi 
+ 1 E E E &Kn)&r(*i)E E (Z, J) 

l.V i\ k,k' 



The calculations are rather lengthy and involve eight terms. But the bright side 
is that the terms can be split into two groups. There are terms involving two 
expectations such as 

1,1' i\ k,k' ^ 

and terms involving three expectations such as 

1,1' ii k,k' 

Still using Lemmas 6.4 and 6.5, we have



(t>jk'qi' , 



n I 4)jk'qi 



EEE"'(*i)"''(*i)^(^J'^(^^i)<^Jfc'(^H))ai(ii)E(0jfe(r,J) fn J c^,k' 

1,1' ii k,k' ^ 

M / M \i 

< 8V2L^misup\\pi\\iY,\\pi-mh[T. 



pv -n / (j)jk'qi 



{ai,ai 



22 n2 



1=1 



\i=i 



< 



M 



L'\\(t>\\i,s^ip\\pi\\-^2^\\pi~qi\\2 25 n2; 



^3 
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^^^ai{ii)ai'{ii)E{<l)jk{Yt^)(j)jk'{YiJ)bi{ii)E{(l}jk{ZiJ) in / (j)jk'Pi'-n / (l)jk'qi' 

' " ii k,k' ^ 



M 



M 



2 M 



^1=1 



1=1 



< 



Li ml sup WmWi ^ lb; - m\\, 2* ni; 



M 



1=1 



n I 4)jk'Pi' -n (pjk'Qi' 

1,1' ii k,k' ^ •' J / 

M / M \ i 

< 8x/2Li||<A||^sup||<z,|liEllf'-«ll2 E<^''^')J 2ini 
' ;=i \;=i / 

' 1=1 

^^^bi{ii)bi'{ii)E{(pjk{ZiJ(j)jk'{Zi^))ai{ii)E{(j)jk{YiJ) in / (pjk'Pi'-n / (pjk'qi'j 
1,1' ii k,k' ^ J / 

M / M \ 5 M 

4^/2 it 11011^ sup Iblli ^ - qih E ^')n E + ^')n) 2* n 



< 



< 



1=1 



1=1 



M 



' 1^1 



Next we come to the second term. We have 



« / <t>ik'Pi' -n (l>3k'(lV 

1,1' ii k,k' ^ •' •' 

M / M \ i 

AV2\miLi sup ibdii E IIP' - «'ii2 E «')n 2 



< 



< 



J2 n2 



4^2— ||</.||^Lisup|b;||i^|b;-gHl2 2* ni, 



1=1 
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^^^ai{ii)ai'{ii)E{<l)jk{YtJ)E{(j)jk'iYiJ)bi{ii)E{(j)jk{Z,^)) in / (j)jk'Pi' - n / (j)jk'qi' 
1,1' ii k,k' ^ 

M I M \ 5 M 



< 



;=i \i=i / 1=1 

M 



^\j'^^ml.LisM\qi\\iY,\\Pi-<li\\2 2* ni 



Y.Y.Y.^i^''^)^i'^^^)^^'i'ik{Zi,))E{cf>jk'{Zi,))hi^^^^ (n j (t>jk'Pl'-n j c^jk'qA 

1,1' ii k,k' ^ J / 



M / M \ 2 

< 4V2\mlLUup\\qi\\iY,\\pi-qi\\JY,{biM}J 2^ ni 
' 1=1 J 



< ^j2-^\mlLisup\\qi\\i^\\pi-qi\\, 2^71^, 
S S S Hii)bi'{ii)E {<j>jk{Zi,)) E {<j>jk'{Zi,)) ai{h)E {<j>jk{Yi,)) 



ii k,k' 

M / M 

< 2V2||0||^Ltsup|h||i^|h-gdl2 E(^"^')n E((«"«')n + (^''MJ 2ini 

' 1=1 \i=i J 1=1 

< W2-3||</.||^Lisup|b,|liElb'-«ll2 2*ni. 



IA32I < C32 ( ^ ) ' 2^ nt E lb* - 9«ll2 . 



All these bounds entail that 

' (=1 

with C32 = 48\/2i?L5j|(^||^. As a consequence, one similarly gets 
1^331 < C33 (^^y 2i ni E lb' - 9HI2 . 



with C33 = C32. 

Let us now consider $|A_{34}|$.

|^34| < I EEE*^"^ (ai(H)0jfe(^ii) - bi{h)(j)jk{ZiJ,ai'{ii)(j)jk'{YiJ - bii{ii)(l>jk'{Zi^)) 

ii k,k' 1,1' 

E {ai{ii)(t)jk{Yi,) - bi{ii)(l>jk{Zi,))E{ai,{ii)(l)jk' (FjJ - • 
Once again, we apply the Cauchy-Schwarz inequality,

n ^ / 4>jk'{pi' - qi') 

< n^Wpi' - qi,\\2. 
I' 

According to Lemma 6.5, we have for any $1 \le i_1 \le n$ and any $l$,



ii I' 



V \bl{^l)\ 



E 



Pjkgi 



< 2L(|a^(^l)| V|6,(ii)|) 2^ ||0|| 



M 



M 



< 2L ||0||oo2M ^(a/,a,)„V^(fej,6j)„ 



1=1 



IM 



< 2L\ — UWool^ V^. 
V A 

According to Lemmas 6.4 and 6.5, we have for any fixed $k$,



4>jkfii 



E 



4>jk' fii 



EE ''^'(^l)"''^*!)! ( / ^jk<l>jk'ft, 
ii k' ^ 

< (ai,ai)„ (ai',ar)„ / (l)jk(t>jk' fi^ 

\ k' 

< ^41, sup(|||5;||oc V ||q;||oc) + (2L)5 ||0||^ sup(||p/||oc V ||q/||oo)^^ 

< ^ ^4L Sup(||p;||oo V ||gr;||oc) + (2L)^||(?;'||ooSup(||p;||oo V ||(7;||oo)^^ n 



and 



< ny^(6,,6,)„(6,,6,)„ (^E|/ 



<l>jk<Pjk'gh 
(t>jk<Pjk'gii 



+ 



4>jk9ii 



E 



< n^{bhbi)^{bi,,h)^ i^iL sup(||pi||ooV||gi||oo) + (2L)^||<^||ooSup(||p,||ooV||g,||oo)^ 

< ^ f4L sup(||p,|looV||g,||oc) + (2i)^|10||ooSup(|bHlooV||(?,|loo)^') n. 



Hence, 



IA34I < C34 25 ( ^ j niJ2\\P'-1ih^ 
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with C34 = 4L||,^||oo\/:R (ALVR+{2L)i\\cP\\^y 

When we carefully look at the bounds of A^i for i e {1, 2, 3, 4}, we deduce that 
there exists a Cste > such that 

; I 

4 

with = ^ C3i. □ 



- M2 



References 

[1] Autin, F. (2006). Maxiset for density estimation on R. Math. Methods
Statist., vol. 15 (2), 123-145.

[2] Avellaneda, M. (1999, 2000, 2001) Quantitative Analysis in Financial Mar- 
kets: Collected Papers of the New York University Mathematical Finance 
Seminar Volumes I,II, III, World Scientific. 

[3] Bernhard, W., and Leblang, D. (2006). Democratic Processes and Financial
Markets, Cambridge University Press, New York.

[4] Butucea, C, and Tribouley, K. (2006). Nonparametric homogeneity tests. 
J. Statist. Plann. and Inference, vol. 136, 597-639. 

[5] Cohen, A., DeVore, R., Kerkyacharian, G., and Picard, D. (2001). Maxi- 
mal spaces with given rate of convergence for thresholding algorithms. Appl. 
Comput. Harmon. Anal., vol. 11 (2), 167-191. 

[6] Cont, R. (2007). Volatility clustering in financial markets: empirical facts
and agent-based models. In Long memory in economics (eds. A. Kirman and
G. Teyssiere), pp 289-309. Springer, Berlin.

[7] Daubechies, I. (1996). Ten Lectures on Wavelets, SIAM, Philadelphia. 

[8] Delmas, C. (2003). On likelihood ratio tests in Gaussian mixture models. 
Indian J. Statist, vol. 65 (3), 513-531. 

[9] Donoho, D., Johnstone, I., Kerkyacharian, G., and Picard, D. (1996). Density
estimation by wavelet thresholding. Ann. Statist., vol. 24 (2), 508-539.

[10] Garel, B. (2001). Likelihood ratio test for univariate Gaussian mixture. J.
Statist. Plann. Inference, vol. 96 (2), 325-350.

[11] Garel, B. (2005). Asymptotic theory of the likelihood ratio test for the
identification of a mixture. J. Statist. Plann. Inference, vol. 131 (2), 271-296.

[12] Hall, P. (1981). On the nonparametric estimation of mixture proportions.
J. Roy. Statist. Soc. Ser. B, vol. 43 (2), 147-156.

[13] Hall, P., and Titterington, D. M. (1984). Efficient Nonparametric Estima- 
tion of Mixture Proportions. J. Roy. Statist. Soc. Ser. B, vol. 46 (3), 465-473. 

[14] Hall, P., and Zhou, X.H. (2003). Nonparametric estimation of component 
distributions in a multivariate mixture. Ann. Statist., vol. 31 (1), 201-224. 

[15] Hosmer, D.W. (1973). A comparison of iterative maximum likelihood estimates
of the parameters of a mixture of two normal distributions under three
types of sample. Biometrics, vol. 29, 761-770.

[16] Lodatko, N., and Maiboroda, R. (2007). Estimation of the density of a
distribution from observations with an admixture. Theory Probab. Math.
Statist., vol. 73, 99-108.

[17] McKnight, P.E., McKnight, K.M., Figueredo, A.J., and Sidani, S. (2007). 
Missing data: a gentle introduction. Guilford Press, New York. 

[18] Maiboroda, R.E. (2000). A homogeneity criterion for mixtures with varying 
concentrations. Ukrainian Math. J., vol. 52 (8), 1256-1263. 

[19] Maiboroda, R.E. (2000). An asymptotically effective estimate for a distri- 
bution from a sample with a varying mixture. Theory Probab. Math. Statist., 
vol. 61, 121-130. 

[20] Pokhyl'ko, D. (2005). Wavelet estimators of a density constructed from
observations of a mixture. Theory Probab. Math. Statist., vol. 70, 135-145.

[21] Gayraud, G., and Pouet, C. (2005). Adaptive Minimax Testing in the Discrete
Regression Scheme. Probab. Theory Related Fields, vol. 133 (4), 531-558.

[22] Qin, J. (1999). Empirical likelihood ratio based confidence intervals for
mixture proportions. Ann. Statist., vol. 27 (4), 1368-1384.

[23] Spokoiny, V.G. (1996). Adaptive hypothesis testing using wavelets. Ann.
Statist., vol. 24 (6), 2477-2498.

[24] Titterington, D.M. (1983). Minimum distance nonparametric estimation of
mixture proportions. J. Roy. Statist. Soc. Ser. B, vol. 45 (1), 37-46.

[25] van de Geer, S. (1995). Asymptotic normality in mixture models. ESAIM 
Probab. Statist., vol. 1, 17-33. 






